RE: wildcard newbie question

2008-01-31 Thread Ard Schrijvers
I have a text field type called courseTitle and it contains Struts 2. If I search courseTitle:strut* I get the documents, but if I search with courseTitle:struts* I do not get any results. Could you please explain why? Just a guess: it might be because of stemming. Do you
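
The stemming guess above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `stem` method is a hypothetical stand-in that strips a trailing "s" and is not the analyzer Solr actually applies, and the class and method names are invented for illustration. It assumes the indexed text is stemmed while the wildcard prefix is compared as typed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model of why a prefix query can miss a stemmed field.
// All names here are hypothetical; this is not Solr or Lucene code.
class WildcardStemDemo {

    // Index-time analysis: lowercase, split on whitespace, stem each token.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(stem(token));
            }
        }
        return terms;
    }

    // Hypothetical stemmer: strips a single trailing "s" from longer tokens.
    static String stem(String token) {
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Assumption: the prefix is NOT analyzed, so it is compared as typed
    // against the (already stemmed) indexed terms.
    static boolean prefixMatches(List<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> terms = analyze("Struts 2");
        System.out.println("indexed terms: " + terms);
        System.out.println("strut*  matches: " + prefixMatches(terms, "strut"));
        System.out.println("struts* matches: " + prefixMatches(terms, "struts"));
    }
}
```

In this model the field value "Struts 2" is indexed as the terms strut and 2, so the prefix strut finds a term while the prefix struts does not.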

Factory in Solr

2008-01-31 Thread Heba Farouk
Hello there, I'm trying to add a factory in Solr for tokenizing Arabic text, but I receive an error (shown at the end of my email). Here is my code: package org.apache.solr.analysis; import gpl.pierrick.brihaye.aramorph.lucene.ArabicTokenizer; import java.io.Reader; import

Re: <commit/> doesn't work

2008-01-31 Thread Yonik Seeley
On Jan 31, 2008 8:20 AM, shenzhuxi [EMAIL PROTECTED] wrote: curl %solr_home% --data-binary <commit/> -H Content-type:text/xml; charset=utf-8 It doesn't work for updates. I have to restart Solr to make the update work. Do I need to use: curl %solr_home% --data-binary commit waitFlush=false

exact matches not possible?

2008-01-31 Thread Jörg Kiegeland
Normally I do substring queries on my field named X. Now, however, I also require exact-match queries, but I do not know how to do this! If I do X:blabla or X:blabla .. all documents containing blabla in field X are returned. However, these are far too many, since I know there is

Re: <commit/> doesn't work

2008-01-31 Thread shenzhuxi
Yonik Seeley wrote: On Jan 31, 2008 8:20 AM, shenzhuxi [EMAIL PROTECTED] wrote: curl %solr_home% --data-binary <commit/> -H Content-type:text/xml; charset=utf-8 It doesn't work for updates. I have to restart Solr to make the update work. Do I need to use: curl %solr_home% --data-binary

Re: exact matches not possible?

2008-01-31 Thread Shalin Shekhar Mangar
I guess you can try specifying your search as a filter query, e.g. q=blabla&fq=X:blabla, which will give back only the exact match. On Jan 31, 2008 7:23 PM, Jörg Kiegeland [EMAIL PROTECTED] wrote: Normally I do substring queries on my field named X. Now, however, I also require exact-match

Re: exact matches not possible?

2008-01-31 Thread Jörg Kiegeland
I guess you can try specifying your search as a filter query, e.g. q=blabla&fq=X:blabla, which will give back only the exact match. I tried this syntax in my Firefox URL field, but it does not seem to help. How do I specify a filter query with SolrJ (i.e. using SolrQuery)?

Re: exact matches not possible?

2008-01-31 Thread Andy Blower
Disclaimer: I've only been working with (evaluating) Solr for three weeks. I had exactly this issue, and I found that using a field of type string gave exact matches. So, if you need to do both substring and exact-match queries, you'll need two fields. One non-tokenized field using class StrField and
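
The difference between the two kinds of field described above can be sketched with a toy model. This is a minimal illustration, not Solr code: the class and method names are hypothetical, the "tokenizer" is assumed to be a plain lowercase whitespace split, and the untokenized field is assumed to keep the whole value as a single term.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy comparison of a tokenized field and an untokenized (string-like) field.
// All names are hypothetical; this only models the idea from the reply above.
class ExactVsTokenizedDemo {

    // Tokenized field: the value is split into lowercase terms.
    static List<String> tokenizedTerms(String value) {
        return Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Untokenized field: the whole value is kept as one term.
    static List<String> stringTerms(String value) {
        return Collections.singletonList(value);
    }

    // A term query matches when the query term equals one of the field's terms.
    static boolean termMatches(List<String> fieldTerms, String queryTerm) {
        return fieldTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        String[] docs = { "blabla", "blabla and more" };
        for (String doc : docs) {
            System.out.println("'" + doc + "'"
                + " tokenized field matches: " + termMatches(tokenizedTerms(doc), "blabla")
                + ", string field matches: " + termMatches(stringTerms(doc), "blabla"));
        }
    }
}
```

In this model both documents match on the tokenized field, while only the document whose entire value is blabla matches on the string field, which is why two fields are needed to support both query styles.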

Re: exact matches not possible?

2008-01-31 Thread Yonik Seeley
On Jan 31, 2008 10:26 AM, Jörg Kiegeland [EMAIL PROTECTED] wrote: I guess you can try specifying your search as a filter query, e.g. q=blabla&fq=X:blabla, which will give back only the exact match. I tried this syntax in my Firefox URL field, but it does not seem to help. How do I specify a

Re: exact matches not possible?

2008-01-31 Thread Shalin Shekhar Mangar
Sorry, I guess this method will not work with tokenized fields. Andy's approach is the standard in this case; however, Yonik's method should also work. As for specifying filter queries with SolrJ, use SolrQuery.addFilterQuery(String filterQuery) to specify filter queries in code. On Jan 31, 2008
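
For the URL form of the filter query discussed in this thread, the q and fq parameters have to be separated by & and URL-encoded. The sketch below builds such a query string with the Java standard library only (Java 10 or later for this URLEncoder overload); the class and method names are hypothetical, and it deliberately does not use SolrJ, where SolrQuery.addFilterQuery plays the corresponding role. As noted above, whether the filter yields an exact match still depends on the field type.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Builds the query-string part of a select request with a main query and
// one filter query. Names are hypothetical; only java.net is used.
class FilterQueryUrlDemo {

    static String buildQueryString(String q, String fq) {
        // Each value is encoded separately so that characters such as ':'
        // inside a value cannot be confused with URL syntax.
        return "q=" + URLEncoder.encode(q, StandardCharsets.UTF_8)
            + "&fq=" + URLEncoder.encode(fq, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints q=blabla&fq=X%3Ablabla
        System.out.println(buildQueryString("blabla", "X:blabla"));
    }
}
```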

Slow response times using *:*

2008-01-31 Thread Andy Blower
I'm evaluating Solr/Lucene for our needs and currently looking at performance, since 99% of the functionality we're looking for is provided. The index contains 18.4 million records and is 58GB in size. Most queries are acceptably quick, once the filters are cached. The filters select one or more

Re: Slow response times using *:*

2008-01-31 Thread Yonik Seeley
On Jan 31, 2008 10:43 AM, Andy Blower [EMAIL PROTECTED] wrote: I'm evaluating Solr/Lucene for our needs and currently looking at performance, since 99% of the functionality we're looking for is provided. The index contains 18.4 million records and is 58GB in size. Most queries are acceptably

Re: Slow response times using *:*

2008-01-31 Thread Shalin Shekhar Mangar
I can't give you a definitive answer based on the data you've provided. However, do you really need to get *all* facets? Can't you limit them with the facet.limit parameter? Are you planning to run multiple *:* queries with all facets turned on against a 58GB index in a live system? I don't think that's a good

Re: Slow response times using *:*

2008-01-31 Thread Andy Blower
Actually I do need all facets for a field, although I've just realised that the tests are limited to only 100. Oops. So it should be worse in reality... erk. Since that's what we do with our current search engine, Solr has to be able to compete with this. The fields are a mix of non-multi,

Re: Slow response times using *:*

2008-01-31 Thread Walter Underwood
How often does the index change? Can you use an HTTP cache and do this once for each new index? wunder On 1/31/08 9:09 AM, Andy Blower [EMAIL PROTECTED] wrote: Actually I do need all facets for a field, although I've just realised that the tests are limited to only 100. Oops. So it should

Re: Slow response times using *:*

2008-01-31 Thread Andy Blower
Yonik Seeley wrote: *:* maps to MatchAllDocsQuery, which for each document needs to check if it's deleted (that's a synchronized call, and can be a bottleneck). Why does this need to check if documents are deleted if normal queries don't? Is there any way of disabling this since I can be

RE: SEVERE: java.lang.OutOfMemoryError: Java heap space

2008-01-31 Thread Alex Benjamen
Thanks to all who responded. Things are running well! The IBM version of the JRE for Intel 64 seems to run well, and the stalling issue (when the Solr instance stops responding and freezes up) has disappeared. What I learned is that Solr is a great product but needs tuning to fit the usage.

How does remote streaming works for xml files?

2008-01-31 Thread Leonardo Santagada
With most of the default solrconfig.xml and setting: <requestDispatcher handleSelect="true"> <!--Make sure your system has some authentication before enabling remote streaming! --> <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048" /> </requestDispatcher> I think

Re: wildcard newbie question

2008-01-31 Thread Mike Klaas
On 30-Jan-08, at 3:31 PM, Alessandro Senserini wrote: I have a text field type called courseTitle and it contains Struts 2. If I search courseTitle:strut* I get the documents, but if I search with courseTitle:struts* I do not get any results. Could you please explain why? Wildcard

Re: Slow response times using *:*

2008-01-31 Thread Mike Klaas
On 31-Jan-08, at 9:41 AM, Andy Blower wrote: Yonik Seeley wrote: This surprises me because the filter query submitted has usually already been submitted along with a normal query, and so should be cached in the filter cache. Surely all Solr needs to do is return a handful of fields for

Re: How does remote streaming works for xml files?

2008-01-31 Thread Ryan McKinley
Jan 31, 2008 9:39:01 PM org.apache.solr.core.SolrCore execute INFO: /update stream.filename=/tmp/commited_1201822625MainThread0_add_file.xml 0 0 Isn't stream.file the parameter name? ryan