Invalid character in search results

2007-12-04 Thread Maciej Szczytowski
Hi, I use a Solr 1.1 application for indexing Russian documents. Sometimes
I get docs with invalid characters in the search results.


For example, I indexed иго but the search returned и??о. It's strange
because something changed 2 bytes into 6 bytes.


иго - D0 B8 D0 B3 D0 BE

и��о - D0 B8 EF BF BD EF BF BD D0 BE

This field is indexed as string verbatim.

<fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>


After reindexing, the documents with the invalid characters are fixed.

Does anybody have an idea where the problem is?

Maciek


Issues using keyword searching and facet search together in a search operation

2007-12-04 Thread Dilip.TS

Hi,
 When I use both the keyword search and the facet search together in the same
search operation, I don't get any results, whereas if I perform them separately
I do get results back.
 Is this a constraint from the SOLR point of view?

Thanks in advance.


Regards,
Dilip TS




Re: Issues using keyword searching and facet search together in a search operation

2007-12-04 Thread Erick Erickson
I can't answer the question, but I *can* guarantee that
the people who can will give you *much* better
responses if you include some details. Like which
analyzers you use, how you submit the query,
samples of the two queries that work and the
one that doesn't.

Imagine you're on the receiving end of this question and
ask yourself whether there is enough info here to make a meaningful
analysis <G>...

Best
Erick

On Dec 4, 2007 5:39 AM, Dilip.TS [EMAIL PROTECTED] wrote:


 Hi,
  When i use both the Keyword search and the facet search together in a
 same
 search operation,
  I dont get any results whereas if i perform them seperately, i could get
 back the results.
  Is it a constraint from the SOLR point of view?

 Thanks in advance.


 Regards,
 Dilip TS





RE: Issues using keyword searching and facet search together in a search operation

2007-12-04 Thread Dilip.TS
 Hi,

Consider the following scenario: I need to use keyword search on the
fields title and description with the keyword typed as testing,
and I use the fields price, publisher and tag for the facet part, with
publisher and tag selected for the facet search.
The constructed queryString for this scenario is something like
this:


facet.limit=-1&rows=100&start=0&facet=true&facet.mincount=1&facet.field=tag&facet.field=publisher&q=title:testing+OR+description:testing+AND+title:lucener;title+asc,score+asc&qt=standard

I'm using the following analyzers:
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
          catenateWords="1" catenateNumbers="1" catenateAll="0"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
          catenateWords="0" catenateNumbers="0" catenateAll="0"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>


 Regards
 Dilip


  -Original Message-
  From: Erick Erickson [mailto:[EMAIL PROTECTED]
  Sent: Tuesday, December 04, 2007 5:30 PM
  To: solr-user@lucene.apache.org; [EMAIL PROTECTED]
  Subject: Re: Issues using keyword searching and facet search together in a
search operation


  I can't answer the question, but I can guarantee that
  the people who can will give you much better
  responses if you include some details. Like which
  analyzers you use, how you submit the query,
  samples of the two queries that work and the
  one that doesn't.


  Best
  Erick


  On Dec 4, 2007 5:39 AM, Dilip.TS [EMAIL PROTECTED] wrote:


Hi,
 When i use both the Keyword search and the facet search together in a
same
search operation,
 I dont get any results whereas if i perform them seperately, i could
get
back the results.
 Is it a constraint from the SOLR point of view?

Thanks in advance.


Regards,
Dilip TS






Field separator for highlighting multi-value fields

2007-12-04 Thread Wagner,Harry
Hi,

The default field separator seems to be a '.' when highlighting
multi-value fields. Can this be overridden in 1.2 to another character?

 

Thanks!

harry



Re: Issues using keyword searching and facet search together in a search operation

2007-12-04 Thread Yonik Seeley
On Dec 4, 2007 5:39 AM, Dilip.TS [EMAIL PROTECTED] wrote:
  When i use both the Keyword search and the facet search together in a same
 search operation,
  I dont get any results whereas if i perform them seperately, i could get
 back the results.

add debugQuery=on to your requests (and change rows to something small
like 5), and then post the results of both URLs here.

-Yonik
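
To make that concrete, a sketch of the two requests to compare (host, port and
handler path are placeholders, not from the thread):

  http://localhost:8983/solr/select?q=title:testing+OR+description:testing&rows=5&debugQuery=on
  http://localhost:8983/solr/select?q=title:testing+OR+description:testing&rows=5&debugQuery=on&facet=true&facet.mincount=1&facet.field=tag&facet.field=publisher

The parsedquery and explain sections of the debug output usually show whether the
base query or the added facet parameters changed the match set.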


Re: Invalid character in search results

2007-12-04 Thread Yonik Seeley
On Dec 4, 2007 5:02 AM, Maciej Szczytowski
[EMAIL PROTECTED] wrote:
 Hi, I use Solr 1.1 application for indexing russian documents. Sometimes
 I've got as search results docs with invalid character.

 For example I've indexed иго but search returned и��о. It's strange
 because something has changed 2 bytes into 6 bytes.

 иго - D0 B8 D0 B3 D0 BE

 и��о - D0 B8 EF BF BD EF BF BD D0 BE

 This field is indexed as string verbatim.

 fieldtype name=string class=solr.StrField sortMissingLast=true
 omitNorms=true/

 After reindexing documents with invalid character are fixed.

 Has anybody idea where is the problem?

Probably an issue with the charset not being set correctly (or the
character encoding not matching the charset declaration) when it was
first indexed.

-Yonik
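
As an illustration of what Yonik describes (a minimal Java sketch, not from the
thread): bytes sent in a non-UTF-8 charset, e.g. windows-1251, but decoded as
UTF-8 get replaced with U+FFFD, which serializes back to exactly the EF BF BD
sequence seen above.

  import java.nio.charset.StandardCharsets;

  public class ReplacementCharDemo {
      public static void main(String[] args) {
          // "иго" encoded in windows-1251 instead of UTF-8 (an assumed example)
          byte[] cp1251 = new byte[] { (byte) 0xE8, (byte) 0xE3, (byte) 0xEE };
          // Decoding these bytes as UTF-8 fails, so Java substitutes U+FFFD for each
          String decoded = new String(cp1251, StandardCharsets.UTF_8);
          for (byte b : decoded.getBytes(StandardCharsets.UTF_8)) {
              System.out.printf("%02X ", b);   // prints EF BF BD EF BF BD EF BF BD
          }
      }
  }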


Re: out of heap space, every day

2007-12-04 Thread Brian Whitman


For faceting and sorting, yes.  For normal search, no.



Interesting you mention that, because one of the other changes since  
last week besides the index growing is that we added a sort to an  
sint field on the queries.


Is it reasonable that a sint sort would require over 2.5GB of heap on  
an 8M-doc index? Is there any empirical data on how much RAM it will need?







Re: out of heap space, every day

2007-12-04 Thread Yonik Seeley
On Dec 4, 2007 10:59 AM, Brian Whitman [EMAIL PROTECTED] wrote:
 
  For faceting and sorting, yes.  For normal search, no.
 

 Interesting you mention that, because one of the other changes since
 last week besides the index growing is that we added a sort to an
 sint field on the queries.

 Is it reasonable that a sint sort would require over 2.5GB of heap on
 a 8M index? Is there any empirical data on how much RAM that will need?

int[maxDoc()] + String[nTerms()] + size_of_all_unique_terms.
Then double that to allow for a warming searcher.

One can decrease this memory usage by using an integer instead of an
sint field if you don't need range queries.  The memory usage would
then drop to a straight int[maxDoc()] (4 bytes per document).

-Yonik
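
To put rough numbers on that formula, a back-of-the-envelope sketch (the 8M-doc,
8M-term, 16-char figures are assumptions for illustration; per-String object
overhead is ignored):

  public class SortMemoryEstimate {
      public static void main(String[] args) {
          long maxDoc = 8000000L;        // documents in the index
          long nTerms = 8000000L;        // unique terms in the sort field
          long avgTermChars = 16;        // average term length in chars

          long ordArray  = maxDoc * 4;                 // int[maxDoc()]
          long termRefs  = nTerms * 4;                 // String[nTerms()] references
          long termChars = nTerms * avgTermChars * 2;  // char data of all unique terms
          long oneSearcher = ordArray + termRefs + termChars;

          // Double it to allow for a warming searcher holding its own FieldCache entry.
          System.out.printf("~%d MB%n", (2 * oneSearcher) / (1024 * 1024));
      }
  }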


Re: out of heap space, every day

2007-12-04 Thread Yonik Seeley
On Dec 4, 2007 10:46 AM, Brian Whitman [EMAIL PROTECTED] wrote:
 Are there 'native' memory requirements for solr as a function of
 index size?

For faceting and sorting, yes.  For normal search, no.

-Yonik


out of heap space, every day

2007-12-04 Thread Brian Whitman
This may be more of a general Java question than a Solr one, but I'm a bit  
confused.


We have a largish solr index, about 8M documents, the data dir is  
about 70G. We're getting about 500K new docs a week, as well as about  
1 query/second.


Recently (when we crossed about the 6M threshold) resin has been  
stopping with the following:


/usr/local/resin/log/stdout.log:[12:08:21.749] [28304] HTTP/1.1 500  
Java heap space
/usr/local/resin/log/stdout.log:[12:08:21.749]  
java.lang.OutOfMemoryError: Java heap space


Only a restart of resin will get it going again, and then it'll crash  
again within 24 hours.


It's a 4GB machine and we run it with args=-J-mx2500m -J-ms2000m. We  
can't really raise this any higher on the machine.


Are there 'native' memory requirements for Solr as a function of  
index size? Does a 70GB index require some minimum amount of wired  
RAM? Or is there some misconfiguration with Resin or Solr or my  
system? I don't really know Java well, but it seems strange that the  
VM can't page RAM out to disk or really do something else besides  
stopping the server.












Re: out of heap space, every day

2007-12-04 Thread Brian Carmalt

Hello,

I am also fighting with heap exhaustion, however during the indexing 
step. I was able to minimize, but not fix, the problem
by setting the thread stack size to 64k with -Xss64k. The minimum size 
is OS-specific, but the VM will tell
you if you set the size too small. You can try it; it may help.

Brian
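
For reference, combined with the Resin JVM arguments quoted elsewhere in this
thread, that flag would look something like the following; treat the -J
pass-through spelling as an assumption rather than a tested configuration:

  args=-J-mx2500m -J-ms2000m -J-Xss64k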

Brian Whitman schrieb:
This maybe more of a general java q than a solr one, but I'm a bit 
confused.


We have a largish solr index, about 8M documents, the data dir is 
about 70G. We're getting about 500K new docs a week, as well as about 
1 query/second.


Recently (when we crossed about the 6M threshold) resin has been 
stopping with the following:


/usr/local/resin/log/stdout.log:[12:08:21.749] [28304] HTTP/1.1 500 
Java heap space
/usr/local/resin/log/stdout.log:[12:08:21.749] 
java.lang.OutOfMemoryError: Java heap space


Only a restart of resin will get it going again, and then it'll crash 
again within 24 hours.


It's a 4GB machine and we run it with args=-J-mx2500m -J-ms2000m We 
can't really raise this any higher on the machine.


Are there 'native' memory requirements for solr as a function of index 
size? Does a 70GB index require some minimum amount of wired RAM? Or 
is there some mis-configuration w/ resin or solr or my system? I don't 
really know Java well but it seems strange that the VM can't page RAM 
out to disk or really do something else beside stopping the server.




Cache use

2007-12-04 Thread Evgeniy Strokin
Hello,...
we have a 110M-record index under Solr. Some queries take a while, but we need 
sub-second results. I guess the only solution is a cache (something else?)...
We use the standard LRUCache. The docs say (as far as I understood) that it 
loads a view of the index into memory and next time works with memory instead of 
the hard drive.
So, my question: hypothetically, we could have the whole index in memory if we had 
enough memory, right? In that case the results should come up very fast. We 
have very rare updates. So I think this could be a solution.
How should I configure the cache to achieve this?
Thanks for any advice.
Gene

Re: Cache use

2007-12-04 Thread Dennis Kubes
One way to do this if you are running on Linux is to create a tmpfs 
(which is RAM) and then mount the filesystem in RAM.  Then your 
index acts normally to the application but is essentially served from 
RAM.  This is how we serve the Nutch Lucene indexes on our web search 
engine (www.visvo.com), which is ~100M pages.  Below is how you can 
achieve this, assuming your indexes are in /path/to/indexes:



mv /path/to/indexes /path/to/indexes.dist
mkdir /path/to/indexes
cd /path/to
mount -t tmpfs -o size=2684354560 none /path/to/indexes
rsync --progress -aptv indexes.dist/* indexes/
chown -R user:group indexes

This would of course be limited by the amount of RAM you have on the 
machine.  But with this approach most searches are sub-second.


Dennis Kubes

Evgeniy Strokin wrote:

Hello,...
we have 110M records index under Solr. Some queries takes a while, but we need 
sub-second results. I guess the only solution is cache (something else?)...
We use standard LRUCache. In docs it says (as far as I understood) that it 
loads view of index in to memory and next time works with memory instead of 
hard drive.
So, my question: hypothetically, we can have all index in memory if we'd have 
enough memory size, right? In this case the result should come up very fast. We 
have very rear updates. So I think this could be a solution.
How should I configure the cache to achieve such approach?
Thanks for any advise.
Gene


Tomcat6 env-entry

2007-12-04 Thread Gary Harris
It works excellently in Tomcat 6. The toughest thing I had to deal with is 
discovering that the environment variable in web.xml for solr/home is 
essential. If you skip that step, it won't come up.


   env-entry
   env-entry-namesolr/home/env-entry-name
   env-entry-typejava.lang.String/env-entry-type
   env-entry-valueF:\Tomcat-6.0.14\webapps\solr/env-entry-value
   /env-entry

- Original Message - 
From: Charlie Jackson [EMAIL PROTECTED]

To: solr-user@lucene.apache.org
Sent: Monday, December 03, 2007 11:35 AM
Subject: RE: Tomcat6?


$CATALINA_HOME/conf/Catalina/localhost doesn't exist by default, but you can 
create it and it will work exactly the same way it did in Tomcat 5. It's not 
created by default because it's not needed by the manager webapp anymore.



-Original Message-
From: Matthew Runo [mailto:[EMAIL PROTECTED]
Sent: Monday, December 03, 2007 10:15 AM
To: solr-user@lucene.apache.org
Subject: Re: Tomcat6?

In context.xml, I added..

<Environment name="/solr/home" value="/Users/mruno/solr-src/example/solr" type="java.lang.String" />

I think that's all I did to get it working in Tomcat 6.

--Matthew Runo

On Dec 3, 2007, at 7:58 AM, Jörg Kiegeland wrote:


The Solr wiki does not describe how to install Solr on
Tomcat 6, and I haven't managed it myself :(
The chapter Configuring Solr Home with JNDI mentions
the directory $CATALINA_HOME/conf/Catalina/localhost, which does not
exist with Tomcat 6.

Alternatively I tried the folder $CATALINA_HOME/work/Catalina/
localhost, but with no success (I can query the top-level page,
but the Solr Admin link then does not work).

Can anybody help?

--
Dipl.-Inf. Jörg Kiegeland
ikv++ technologies ag
Bernburger Strasse 24-25, D-10963 Berlin
e-mail: [EMAIL PROTECTED], web: http://www.ikv.de
phone: +49 30 34 80 77 18, fax: +49 30 34 80 78 0
=
Handelsregister HRB 81096; Amtsgericht Berlin-Charlottenburg
board of  directors: Dr. Olaf Kath (CEO); Dr. Marc Born (CTO)
supervising board: Prof. Dr. Bernd Mahr (chairman)
_









Re: Cache use

2007-12-04 Thread Yonik Seeley
The first step is to look at what searches are taking too long, and
see if there is a way to structure them so they don't take as long.

The whole index doesn't have to be in memory to get good search
performance, but 100M documents on a single server is big.  We are
working on distributed search (SOLR-303) so an index can be split
across multiple servers.

-Yonik

On Dec 4, 2007 11:43 AM, Evgeniy Strokin [EMAIL PROTECTED] wrote:
 Hello,...
 we have 110M records index under Solr. Some queries takes a while, but we 
 need sub-second results. I guess the only solution is cache (something 
 else?)...
 We use standard LRUCache. In docs it says (as far as I understood) that it 
 loads view of index in to memory and next time works with memory instead of 
 hard drive.
 So, my question: hypothetically, we can have all index in memory if we'd have 
 enough memory size, right? In this case the result should come up very fast. 
 We have very rear updates. So I think this could be a solution.
 How should I configure the cache to achieve such approach?
 Thanks for any advise.
 Gene


SOLR 1.3 trunk error

2007-12-04 Thread Matthew Runo

Hello!

I'm trying to make use of SOLR 1.3, svn trunk, and get the following  
error.


SEVERE: java.lang.NoSuchMethodError: org.apache.solr.search.QParser.getSort(Z)Lorg/apache/solr/search/QueryParsing$SortSpec;
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:66)
    at org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:93)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:826)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:206)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:619)

--Matthew


Re: SOLR 1.3 trunk error

2007-12-04 Thread Matthew Runo
Ooops, I get this error when I try to search an index with a few  
documents in it.


ie..

http://dev14.zappos.com:8080/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on

caching : true
numDocs : 5
maxDoc : 5
readerImpl : MultiReader
readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
indexVersion : 1196707950551
openedAt : Tue Dec 04 10:14:58 PST 2007
registeredAt : Tue Dec 04 10:14:58 PST 2007

On Dec 4, 2007, at 10:19 AM, Matthew Runo wrote:


Hello!

I'm trying to make use of SOLR 1.3, svn trunk, and get the following  
error.


SEVERE: java.lang.NoSuchMethodError: org.apache.solr.search.QParser.getSort(Z)Lorg/apache/solr/search/QueryParsing$SortSpec;
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:66)
    at org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:93)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:826)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:206)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:619)

--Matthew





Re: out of heap space, every day

2007-12-04 Thread Mike Klaas

On 4-Dec-07, at 8:10 AM, Brian Carmalt wrote:


Hello,

I am also fighting with heap exhaustion, however during the  
indexing step. I was able to minimize, but not fix the problem
by setting the thread stack size to 64k with -Xss64k. The minimum  
size is os specific, but the VM will tell

you if you set the size too small. You can try it, it may help


This seems surprising unless you are positively hammering Solr with  
tons of different threads during indexing.  It's probably not worth  
using more than # processors + a few.


-Mike


Re: SOLR 1.3 trunk error

2007-12-04 Thread Ryan McKinley

did you try 'ant clean' before running 'ant dist'?

the method signature for SortSpec changed recently
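
In other words, from the root of the trunk checkout, something like:

  ant clean dist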


Matthew Runo wrote:
Ooops, I get this error when I try to search an index with a few 
documents in it.


ie..

http://dev14.zappos.com:8080/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on 



caching : true
numDocs : 5
maxDoc : 5
readerImpl : MultiReader
readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
indexVersion : 1196707950551
openedAt : Tue Dec 04 10:14:58 PST 2007
registeredAt : Tue Dec 04 10:14:58 PST 2007

On Dec 4, 2007, at 10:19 AM, Matthew Runo wrote:


Hello!

I'm trying to make use of SOLR 1.3, svn trunk, and get the following 
error.


SEVERE: java.lang.NoSuchMethodError: 
org.apache.solr.search.QParser.getSort(Z)Lorg/apache/solr/search/QueryParsing$SortSpec; 

at 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:66) 

at 
org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:93) 

at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117) 


at org.apache.solr.core.SolrCore.execute(SolrCore.java:826)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:206) 

at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174) 

at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) 

at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) 

at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) 

at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) 

at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) 

at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) 

at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) 

at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263) 

at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844) 

at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584) 

at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)

at java.lang.Thread.run(Thread.java:619)

--Matthew








Re: SOLR 1.3 trunk error

2007-12-04 Thread Matthew Runo

Wow. So I feel stupid. Sorry to waste your time =p

--Matthew

On Dec 4, 2007, at 10:36 AM, Ryan McKinley wrote:


did you try 'ant clean' before running 'ant dist'?

the method signature for SortSpec changed recently


Matthew Runo wrote:
Ooops, I get this error when I try to search an index with a few  
documents in it.

ie..
http://dev14.zappos.com:8080/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on 
 caching : true

numDocs : 5
maxDoc : 5
readerImpl : MultiReader
readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
indexVersion : 1196707950551
openedAt : Tue Dec 04 10:14:58 PST 2007
registeredAt : Tue Dec 04 10:14:58 PST 2007
On Dec 4, 2007, at 10:19 AM, Matthew Runo wrote:

Hello!

I'm trying to make use of SOLR 1.3, svn trunk, and get the  
following error.


SEVERE: java.lang.NoSuchMethodError: org.apache.solr.search.QParser.getSort(Z)Lorg/apache/solr/search/QueryParsing$SortSpec;
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:66)
    at org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:93)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:826)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:206)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:619)

--Matthew







Re: Cache use

2007-12-04 Thread Mike Klaas

On 4-Dec-07, at 8:43 AM, Evgeniy Strokin wrote:


Hello,...
we have 110M records index under Solr. Some queries takes a while,  
but we need sub-second results. I guess the only solution is cache  
(something else?)...
We use standard LRUCache. In docs it says (as far as I understood)  
that it loads view of index in to memory and next time works with  
memory instead of hard drive.
So, my question: hypothetically, we can have all index in memory if  
we'd have enough memory size, right? In this case the result should  
come up very fast. We have very rear updates. So I think this could  
be a solution.


How big is the index on disk (the most important files are .frq,  
and .prx if you do phrase queries)?  How big and what exactly is a  
record in your system?  Do you do faceting/sorting?   How much memory  
do you have?  What does a typical query look like?


Performance is a tricky subject.  It is hard to give any kind of  
useful answer that applies in general.  The one thing I can say is  
that 110M is a _lot_ of docs for one system, especially if these are  
normal-sized documents


regards,
-Mike


Re: Cache use

2007-12-04 Thread evgeniy . strokin
Any suggestions are helpful to me, even general ones. Here is the info about my 
index:
How big is the index on disk (the most important files are .frq,  
and .prx if you do phrase queries)?  
- Total index folder size is 30.7 GB
- .frq is 12.2 GB
- .prx is 6 GB
 
How big and what exactly is a record in your system?  
- A record is a document with 100 fields indexed and 10 of them stored. 
Approximately 60% of the fields contain data.
 
Do you do faceting/sorting?  
- Yes, I'm planning to do both.
 
How much memory do you have?  
- I have 8 GB of RAM and could get up to 16 GB.
 
What does a typical query look like?
- I don't know yet. We are in prototype mode. We try everything possible. In 
general we are able to get results in sub-second time. But some queries take long, 
for example TOWN:L*. I know this is a very broad query, and probably the worst 
one. But we may need such queries, to get the number of towns with a name 
starting with L, for example. Cache helps a little; for example, after this 
query, if I run TOWN:La* I get the result in milliseconds.
But what puzzles me is: if I run a query like TOWN:L* OR STREET:S*, 
I'm guessing it should cache all data of that set. If I then run just TOWN:L*, 
which is a subset of the first query, it still takes time to get the result 
back, as if it's not cached.


- Original Message 
From: Mike Klaas [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Tuesday, December 4, 2007 2:33:24 PM
Subject: Re: Cache use

On 4-Dec-07, at 8:43 AM, Evgeniy Strokin wrote:

 Hello,...
 we have 110M records index under Solr. Some queries takes a while,  
 but we need sub-second results. I guess the only solution is cache  
 (something else?)...
 We use standard LRUCache. In docs it says (as far as I understood)  
 that it loads view of index in to memory and next time works with  
 memory instead of hard drive.
 So, my question: hypothetically, we can have all index in memory if  
 we'd have enough memory size, right? In this case the result should  
 come up very fast. We have very rear updates. So I think this could  
 be a solution.

How big is the index on disk (the most important files are .frq,  
and .prx if you do phrase queries?  How big and what exactly is a  
record in your system?  Do you do faceting/sorting?  How much memory  
do you have?  What does a typical query look like?

Performance is a tricky subject.  It is hard to give any kind of  
useful answer that applies in general.  The one thing I can say is  
that 110M is a _lot_ of docs for one system, especially if these are  
normal-sized documents

regards,
-Mike

Re: Cache use

2007-12-04 Thread evgeniy . strokin
Thanks, this is a very interesting idea. But my index folder is about 30 GB, and the max 
RAM I could get is probably 16 GB. The rest could go in swap, but I think that would 
kill the whole idea. Maybe it would be useful to put just some of the files from the 
index folder into RAM? If that is possible at all...


- Original Message 
From: Dennis Kubes [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Tuesday, December 4, 2007 12:00:55 PM
Subject: Re: Cache use

One way to do this if you are running on linux is to create a tempfs 
(which is ram) and then mount the filesystem in the ram.  Then your 
index acts normally to the application but is essentially served from 
Ram.  This is how we server the Nutch lucene indexes on our web search 
engine (www.visvo.com) which is ~100M pages.  Below is how you can 
achieve this, assuming your indexes are in /path/to/indexes:


mv /path/to/indexes /path/to/indexes.dist
mkdir /path/to/indexes
cd /path/to
mount -t tmpfs -o size=2684354560 none /path/to/indexes
rsync --progress -aptv indexes.dist/* indexes/
chown -R user:group indexes

This would of course be limited by the amount of RAM you have on the 
machine.  But with this approach most searches are sub-second.

Dennis Kubes

Evgeniy Strokin wrote:
 Hello,...
 we have 110M records index under Solr. Some queries takes a while, but we 
 need sub-second results. I guess the only solution is cache (something 
 else?)...
 We use standard LRUCache. In docs it says (as far as I understood) that it 
 loads view of index in to memory and next time works with memory instead of 
 hard drive.
 So, my question: hypothetically, we can have all index in memory if we'd have 
 enough memory size, right? In this case the result should come up very fast. 
 We have very rear updates. So I think this could be a solution.
 How should I configure the cache to achieve such approach?
 Thanks for any advise.
 Gene

RE: How to delete records that don't contain a field?

2007-12-04 Thread Norskog, Lance
Oops, I should explain.  *:* means all records. This trick puts a
positive query in front of your negative query, and that allows it to
work.

Lance 

-Original Message-
From: Rob Casson [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 7:44 AM
To: solr-user@lucene.apache.org
Subject: Re: How to delete records that don't contain a field?

i'm using this:

<delete><query>*:* -[* TO *]</query></delete>

which is what lance suggested..works just fine.

fyi: https://issues.apache.org/jira/browse/SOLR-381
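
Putting Lance's *:* prefix together with the original curl command gives
something like this (a sketch; the host/port and the _title field are taken from
the thread below):

  curl http://localhost:8080/solr/update --data-binary \
    '<delete><query>*:* -_title:[* TO *]</query></delete>' \
    -H 'Content-type:text/xml; charset=utf-8'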

On Dec 3, 2007 8:09 PM, Norskog, Lance [EMAIL PROTECTED] wrote:
 Wouldn't this be: *:* AND negative query


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik 
 Seeley
 Sent: Monday, December 03, 2007 2:23 PM
 To: solr-user@lucene.apache.org
 Subject: Re: How to delete records that don't contain a field?

 On Dec 3, 2007 5:22 PM, Jeff Leedy [EMAIL PROTECTED] wrote:

  I was wondering if there was a way to post a delete query using curl

  to delete all records that do not contain a certain field--something

  like
  this:
 
  curl http://localhost:8080/solr/update --data-binary
  '<delete><query>-_title:[* TO *]</query></delete>' -H 
  'Content-type:text/xml; charset=utf-8'
 
  The minus syntax seems to return the correct list of ids (that is, 
  all

  records that do not contain the _title field) when I use the Solr 
  administrative console to do the above query, so I'm wondering if 
  Solr

  just doesn't support this type of delete.


 Not yet... it makes sense to support this in the future though.

 -Yonik



RE: out of heap space, every day

2007-12-04 Thread Norskog, Lance
Thanks!

I've seen a few formulae like this go by over the months. Can someone
please make a wiki page for memory and processing estimation with
locality properties?  Or is there a Lucene page we can use?

Lance 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Tuesday, December 04, 2007 8:06 AM
To: solr-user@lucene.apache.org
Subject: Re: out of heap space, every day

On Dec 4, 2007 10:59 AM, Brian Whitman [EMAIL PROTECTED] wrote:
 
  For faceting and sorting, yes.  For normal search, no.
 

 Interesting you mention that, because one of the other changes since 
 last week besides the index growing is that we added a sort to an sint

 field on the queries.

 Is it reasonable that a sint sort would require over 2.5GB of heap on 
 a 8M index? Is there any empirical data on how much RAM that will
need?

int[maxDoc()] + String[nTerms()] + size_of_all_unique_terms.
Then double that to allow for a warming searcher.

One can decrease this memory usage by using an integer instead of an
sint field if you don't need range queries.  The memory usage would
then drop to a straight int[maxDoc()] (4 bytes per document).

-Yonik


Re: out of heap space, every day

2007-12-04 Thread Brian Whitman


int[maxDoc()] + String[nTerms()] + size_of_all_unique_terms.
Then double that to allow for a warming searcher.



This is great, but can you help me parse this? Assume 8M docs and I'm  
sorting on an int field that is unix time (seconds since epoch). For  
the purposes of the experiment assume every doc was indexed at a  
unique time.


so..

(int[8M] + String[8M], each term is 16 chars + 8M*4) * 2

that's 384MB by my calculation. Is that right?




RE: out of heap space, every day

2007-12-04 Thread Norskog, Lance
String[nTerms()]: Does this mean that you compare the first term, then
the second, etc.? Otherwise I don't understand how to compare multiple
terms in two records.

Lance 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Tuesday, December 04, 2007 8:06 AM
To: solr-user@lucene.apache.org
Subject: Re: out of heap space, every day

On Dec 4, 2007 10:59 AM, Brian Whitman [EMAIL PROTECTED] wrote:
 
  For faceting and sorting, yes.  For normal search, no.
 

 Interesting you mention that, because one of the other changes since 
 last week besides the index growing is that we added a sort to an sint

 field on the queries.

 Is it reasonable that a sint sort would require over 2.5GB of heap on 
 a 8M index? Is there any empirical data on how much RAM that will
need?

int[maxDoc()] + String[nTerms()] + size_of_all_unique_terms.
Then double that to allow for a warming searcher.

One can decrease this memory usage by using an integer instead of an
sint field if you don't need range queries.  The memory usage would
then drop to a straight int[maxDoc()] (4 bytes per document).

-Yonik


Re: out of heap space, every day

2007-12-04 Thread Yonik Seeley
On Dec 4, 2007 3:11 PM, Norskog, Lance [EMAIL PROTECTED] wrote:
 String[nTerms()]: Does this mean that you compare the first term, then
 the second, etc.? Otherwise I don't understand how to compare multiple
 terms in two records.

Lucene sorting only supports a single term per document for a field.
The String array stores the values of all the unique terms (so
nTerms() above should be numberUniqueTerms).

See Lucene's FieldCache.StringIndex

-Yonik


Re: out of heap space, every day

2007-12-04 Thread Charles Hornberger
 See Lucene's FieldCache.StringIndex

To understand just what's getting stored for each string field, you
may also want to look at the createValue() method of the inner Cache
object instantiated as stringsIndexCache in FieldCacheImpl.java (line
399 in HEAD):

http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/FieldCacheImpl.java?view=markup

-Charlie


Re: Cache use

2007-12-04 Thread Matthew Phillips
Thanks for the suggestion, Dennis. I decided to implement this as you 
described on my collection of about 400,000 documents, but I did not 
receive the results I expected.


Prior to putting the indexes on a tmpfs, I did a bit of benchmarking and 
found that it usually takes a little under two seconds for each facet 
query. After moving my indexes from disk to a tmpfs file system, I seem 
to get about the same result from facet queries: about two seconds.


Does anyone have any insight into this? Doesn't it seem odd that my 
response times are about the same? Thanks for the help.


Matt Phillips

Dennis Kubes wrote:
One way to do this if you are running on linux is to create a tempfs 
(which is ram) and then mount the filesystem in the ram.  Then your 
index acts normally to the application but is essentially served from 
Ram.  This is how we server the Nutch lucene indexes on our web search 
engine (www.visvo.com) which is ~100M pages.  Below is how you can 
achieve this, assuming your indexes are in /path/to/indexes:



mv /path/to/indexes /path/to/indexes.dist
mkdir /path/to/indexes
cd /path/to
mount -t tmpfs -o size=2684354560 none /path/to/indexes
rsync --progress -aptv indexes.dist/* indexes/
chown -R user:group indexes

This would of course be limited by the amount of RAM you have on the 
machine.  But with this approach most searches are sub-second.


Dennis Kubes

Evgeniy Strokin wrote:

Hello,...
we have 110M records index under Solr. Some queries takes a while, but 
we need sub-second results. I guess the only solution is cache 
(something else?)...
We use standard LRUCache. In docs it says (as far as I understood) 
that it loads view of index in to memory and next time works with 
memory instead of hard drive.
So, my question: hypothetically, we can have all index in memory if 
we'd have enough memory size, right? In this case the result should 
come up very fast. We have very rear updates. So I think this could be 
a solution.

How should I configure the cache to achieve such approach?
Thanks for any advise.
Gene


Re: out of heap space, every day

2007-12-04 Thread Charles Hornberger
It seems to me that another way to write the formula -- borrowing
Python syntax -- is:

4 * numDocs + 38 * len(uniqueTerms) + 2 * sum([len(t) for t in uniqueTerms])

That's 4 bytes per document, plus 38 bytes per term, plus 2 bytes *
the sum of the lengths of the terms. (Numbers taken from
http://martin.nobilitas.com/java/sizeof.html)

Does that seem right?

-Charlie
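
Plugging in the 8M-document example from earlier in the thread (8M docs, 8M
unique terms, 16 chars per term — illustrative numbers only):

  4*8,000,000 + 38*8,000,000 + 2*(16*8,000,000)
  = 32 MB + 304 MB + 256 MB
  ≈ 592 MB per searcher (before doubling for a warming searcher)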

On Dec 4, 2007 12:31 PM, Charles Hornberger
[EMAIL PROTECTED] wrote:
  See Lucene's FieldCache.StringIndex

 To understand just what's getting stored for each string field, you
 may also want to look at the createValue() method of the inner Cache
 object instantiated as stringsIndexCache in FieldCacheImpl.java (line
 399 in HEAD):

 http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/FieldCacheImpl.java?view=markup

 -Charlie



Re: synonyms

2007-12-04 Thread Laurent Gilles
Hi,

I had to work with this kind of side effect regarding multiword synonyms.
We installed Solr on our project, which uses synonyms extensively: a big
list that can sometimes produce a wrong match, such as the one
noticed by Anuvenk,
for instance

 dui => drunk driving defense
  or
 dui,drunk driving defense,drunk driving law
 a query for dui matches both dui => drunk driving defense and dui,drunk driving 
 defense,drunk driving law

In order to prevent this kind of behavior I gave every synonym
family (i.e., a single line in the file) a unique identifier,
so the list looks like:

dui => HIER_FAMILIY_01
drunk driving defense => HIER_FAMILIY_01
SYN_FAMILY_01, dui,drunk driving defense,drunk driving law

I also set the synonym filter at index time with expand=false, and at
query time with expand=false.

This way, the matched synonyms (multi-word or single-word) in
documents are replaced with their family identifier, and not with all the
possibilities. Indexing with expand=true would add words to documents
that could be matched alone, ignoring the fact that they belong to a
multiword expression, and this can end up with a wrong match
(a synonym mix) at query time.

So a query for dui will be changed by the synonym
filter at query time to HIER_FAMILIY_01 or SYN_FAMILY_01, so
documents that contain only single words like drunk, driving or
law will not be matched, since only a document with the phrase drunk
driving law would have been indexed with SYN_FAMILY_01.

The approach worked pretty well on our project and we did not notice
any side effects on searches; it only removed matched documents
that were considered noise from the synonym mix issue.
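
As a concrete sketch of this setup (field and file names are placeholders; the
SynonymFilterFactory is the one discussed above), the same filter is declared
with expand=false in both analyzer chains:

  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>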

I think it could be useful to add this kind of approach to the synonym
filter section of the Solr wiki.

Cheers

Laurent


On Dec 2, 2007 3:41 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote:
 Hi (changing to solr-user list)

 Yes it is, especially if the terms left of = are multi-spaced.  Check out 
 the Wiki, one page there explains this nicely.

 Otis
 -
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

 - Original Message 
 From: anuvenk [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Sent: Saturday, December 1, 2007 1:21:49 AM
 Subject: Re: synonyms


 Ideally, would it be a good idea to pass the index data through the
  synonyms
 filter while indexing?
 Also,
 say i have this mapping
 dui = drunk driving defense
  or
 dui,drunk driving defense,drunk driving law

 so matches for dui, will also bring up matches for drunk driving law
  (the
 whole phrase) or does it also bring up all matches for 'drunk' ,
 'driving','law'  ?



 Yonik Seeley wrote:
 
  On Nov 30, 2007 5:39 PM, anuvenk [EMAIL PROTECTED] wrote:
  Should data be re-indexed everytime synonyms like
  word1,word2
  or
  word1 = word2
 
  are added to synonyms.txt
 
  Yes, if it changes the index (if it's used in the index anaylzer as
  opposed to just the query analyzer).
 
  -Yonik
 
 

 --
 View this message in context:
  http://www.nabble.com/synonyms-tf4925232.html#a14100346
 Sent from the Solr - Dev mailing list archive at Nabble.com.







SOLR sorting - question

2007-12-04 Thread Kasi Sankaralingam
Do I need to select the fields that I am trying to sort on in the query? For 
example, if I want to sort on update date, do I need to select that field?

Thanks,


Re: SOLR sorting - question

2007-12-04 Thread climbingrose
I don't think you have to. Just try the query on the REST interface and you
will know.

On Dec 5, 2007 9:56 AM, Kasi Sankaralingam [EMAIL PROTECTED] wrote:

 Do I need to select the fields in the query that I am trying to sort on?,
 for example if I want sort on update date then do I need to select that
 field?

 Thanks,




-- 
Regards,

Cuong Hoang


Re: SOLR sorting - question

2007-12-04 Thread Ryan McKinley

Kasi Sankaralingam wrote:

Do I need to select the fields in the query that I am trying to sort on?, for 
example if I want sort on update date then do I need to select that field?



I don't think so... are you getting an error?

I run queries like:
/select?q=*:*&fl=name&sort=added desc
without problem

ryan


solr + maven?

2007-12-04 Thread Ryan McKinley

Is anyone managing solr projects with maven?  I see:
https://issues.apache.org/jira/browse/SOLR-19
but that is 1 year old

If someone has a current pom.xml, can you post it on SOLR-19?

I just started messing with maven, so I don't really know what I am 
doing yet.


thanks
ryan


RE: SOLR sorting - question

2007-12-04 Thread Kasi Sankaralingam
Thanks a ton, that worked

-Original Message-
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 04, 2007 3:08 PM
To: solr-user@lucene.apache.org
Subject: Re: SOLR sorting - question

Kasi Sankaralingam wrote:
 Do I need to select the fields in the query that I am trying to sort on?, for 
 example if I want sort on update date then do I need to select that field?


I don't think so... are you getting an error?

I run queries like:
/select?q=*:*&fl=name&sort=added desc
without problem

ryan


Re: LowerCaseFilterFactory and spellchecker

2007-12-04 Thread Chris Hostetter

: It does make some sense, but I'm not sure that it should be blindly analyzed
: without adding logic to handle certain cases (like the QueryParser does).
: What happens if the analyzer produces two tokens?  The spellchecker has to
: deal with this appropriately.  Spell checkers should be able to reverse
: analyze the suggestions as well, so Pyhton gets corrected to Python and
: not python.  Similarly, ad-hco should probably suggest ad-hoc and not
: adhoc.

These all seem like arguments in favor of using the query analyzer for the 
source field ... yes, the person making the schema has to think carefully 
about what the analyzer does, but they already have to be equally careful 
about what the indexing analyzer does.

Bottom line: if the indexing analyzer is used to build the dictionary, the 
query analyzer should be used before looking up entries in the dictionary.

Python is only a good suggestion for Pyhton if searching for Python 
is going to return something. python might be a better suggestion.  
Likewise Python might be a good suggestion for python if it's always 
capitalized in the source field.

-Hoss



Re: Distribution without SSH?

2007-12-04 Thread Chris Hostetter

: I recently set up Solr with distribution on a couple of servers. I just
: learned that our network policies do not permit us to use SSH with
: passphraseless keys, and the snappuller script uses SSH to examine the master
: Solr instance's state before it pulls the newest index via rsync.

you may want to question/clarify this policy ... while it's generally a 
good idea to have a policy like this for *users*, there's very little 
reason for it when you're dealing with role users ... accounts that 
exist solely to execute specific applications and have limited 
permissions.  if you have a solruser with a passphraseless key, which 
only works on the specific machines running solr, and solruser can only 
read/write the specific files it needs to for replication, there's very 
little downside.

: scripts, as required) to eliminate this dependency on SSH. I thought I ask the
: list in case anyone has experience with this same situation or any insights
: into the reasoning behind requiring SSH access to the master instance.

i haven't looked at those scripts in a while, but i believe it's twofold:
  1) get the name of the most current snapshot
  2) notify the master which snapshot is being used (for the status page)



-Hoss



Re: 1.2 commit script chokes on 1.2 response format

2007-12-04 Thread Chris Hostetter

: It's a trivial fix, and it seems like it's already been done in trunk:
: 
: 
http://svn.apache.org/viewvc/lucene/solr/trunk/src/scripts/commit?r1=543259&r2=555612&view=patch
: 
: The change has not been applied to 1.2. It might be nice if it were.

i'm not sure what you mean by applied to 1.2 ... releases are static: 
once published they are never changed.  in the event of serious bugs (ie: 
security holes or crash related bugs) then point releases may be published 
(ie solr-1.2.1) but most bugs don't warrant this.



-Hoss



Re: Tomcat6 env-entry

2007-12-04 Thread Chris Hostetter

: It works excellently in Tomcat 6. The toughest thing I had to deal with is
: discovering that the environment variable in web.xml for solr/home is
: essential. If you skip that step, it won't come up.

no, there's no reason why you should need to edit the web.xml file ... the 
solr/home property can be set in a Context configuration using an 
Environment directive without ever opening the solr.war.  See this 
section of the tomcat docs for more details...

http://tomcat.apache.org/tomcat-6.0-doc/config/context.html#Environment%20Entries
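
For example, a context fragment along these lines (a sketch; the file name and
paths are placeholders):

  <!-- $CATALINA_HOME/conf/Catalina/localhost/solr.xml -->
  <Context docBase="/path/to/solr.war" debug="0" crossContext="true">
    <Environment name="solr/home" type="java.lang.String" value="/path/to/solr/home" override="true"/>
  </Context>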

:env-entry
:env-entry-namesolr/home/env-entry-name
:env-entry-typejava.lang.String/env-entry-type
:env-entry-valueF:\Tomcat-6.0.14\webapps\solr/env-entry-value
:/env-entry


-Hoss



Re: Tomcat6 env-entry

2007-12-04 Thread Yousef Ourabi
Tomcat unpacks the jar into the webapps directory based off the context name 
anyway...

What was the original thinking behind not having solr/home set in the web.xml 
-- seems like an easier way to deal with this.

I would imagine most people are more familiar with setting params in web.xml 
than manually creating Contexts for their webapp...

In fact I would take it a step further and have a default value of /opt/solr (or 
whatever...), and if a specific user wants to change it they can just edit their 
web.xml.

This would simplify the documentation: instead of "configure your stuff in the 
Context", it becomes "this is the default; copy example/solr to /opt/solr (or 
we have a script do it) and deploy the .war".


- Original Message -
From: Chris Hostetter [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Tuesday, December 4, 2007 6:34:55 PM (GMT-0800) America/Los_Angeles
Subject: Re: Tomcat6 env-entry


: It works excellently in Tomcat 6. The toughest thing I had to deal with is
: discovering that the environment variable in web.xml for solr/home is
: essential. If you skip that step, it won't come up.

no, there's no reason why you should need to edit the web.xml file ... the 
solr/home property can be set in a Context configuration using an 
Environment directive without ever opening the solr.war.  See this 
section of the tomcat docs for me details...

http://tomcat.apache.org/tomcat-6.0-doc/config/context.html#Environment%20Entries

:env-entry
:env-entry-namesolr/home/env-entry-name
:env-entry-typejava.lang.String/env-entry-type
:env-entry-valueF:\Tomcat-6.0.14\webapps\solr/env-entry-value
:/env-entry


-Hoss




single word Vs multiple word search

2007-12-04 Thread Dilip.TS

Hi,

Consider the scenario: I have indexed a document with a field1 having the
value Test solr search (multiple words).
When I perform the keyword search Test solr search I do get
results, whereas when I search for just Test, I don't get any results.
Any quick inputs would be of great help...

Thanks in advance.


Regards,
Dilip TS
Starmark Services Pvt. Ltd.



RE: single word Vs multiple word search

2007-12-04 Thread Dilip.TS
Hi,

This is in continuation of my previous mail.

I am using the SolrInputDocument to perform the index operation.

So, my question: if a field to be indexed contains multiple words,
does the SolrInputDocument index each word of that field separately,
or does it index the set of words as a whole?

Thanks in advance.

Regards,
Dilip TS

-Original Message-
From: Dilip.TS [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 05, 2007 10:48 AM
To: SOLR
Subject: single word Vs multiple word search



Hi,

Consider the scenario: I have indexed a document with a field1 having the
values as Test solr search  (having multiple  words)
And when i perform the keyword search as Test solr search i do get the
results,
whereas when i do the search for the Test, i dont get any results,
Any quick inputs would be of  great help...

Thanks in advance.


Regards,
Dilip TS
Starmark Services Pvt. Ltd.