Is this solr 1.2 a final version?

2007-06-07 Thread Thierry Collogne
Hello, I was just downloading solr and noticed that there is a 1.2 version available. Is this the final 1.2 version? Is this the version that is to be used? Thank you, Thierry

how to crawl when Solr is search engine?

2007-06-07 Thread Manoharam Reddy
I have just begun using Solr. I see that we have to insert documents by posting XML to solr/update. I would like to know how Solr is used as a search engine in enterprises. How do you crawl your intranet and pass the information as XML to solr/update? Isn't this going to be slow?
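
For readers unfamiliar with the update format mentioned above, here is a minimal sketch of how a crawler might turn a fetched page into the XML that gets posted to solr/update. The `build_add_xml` helper, the sample page, and its field names are hypothetical; the real field names depend on your schema.xml. The sketch only builds the payload and does not contact a server.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Serialize a list of {field: value} dicts as a Solr <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


# Hypothetical crawled page; these field names are assumptions, not a real schema.
page = {"id": "intranet/home", "title": "Home & News"}
payload = build_add_xml([page])
print(payload)
```

The resulting string would be sent as the body of an HTTP POST to solr/update, followed by a `<commit/>` message to make the documents searchable. Putting many `<doc>` elements into one `<add>` is one way to reduce the per-request overhead the question worries about.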

Re: how to crawl when Solr is search engine?

2007-06-07 Thread Bertrand Delacretaz
On 6/7/07, Ian Holsman [EMAIL PROTECTED] wrote: . it's called XSLT. most modern browsers can do the transform on the client side. otherwise there are some server side tools (cocoon I think does this) to do the transform on the server before sending it out Solr also does server-side XSLT,

Logging errors from multiple solr instances

2007-06-07 Thread Walter Lewis
I'm running solr 1.1 under Tomcat 5.5. On the development machine there are a modest number of instances of solr indexes (six). In the logs currently the only way to distinguish them is to compare the [EMAIL PROTECTED], where the someIdentifier changes each time Tomcat is restarted

host logging options (was Re: Schema validator/debugger)

2007-06-07 Thread Walter Lewis
Andrew Nagy wrote: Yonik Seeley wrote: I dropped your schema.xml directly into the Solr example (using Jetty), fired it up, and everything works fine!? Okay, I switched over to Jetty and now I get a different error: SEVERE: org.apache.solr.core.SolrException: undefined field text As someone

Re: Wildcards / Binary searches

2007-06-07 Thread Frédéric Glorieux
Sorry to jump on a side note of the thread, but the topic touches on one of my current needs. Side Note: It's my opinion that 'type ahead' or 'auto complete' style functionality is best addressed by customized logic (most likely using specially built fields containing all of the prefixes

Solr 1.2 released

2007-06-07 Thread Yonik Seeley
Solr 1.2 is now available for download! This is the first release since Solr graduated from the Incubator, and includes many improvements, including CSV/delimited-text data loading, time based auto-commit, faster faceting, negative filters, a spell-check handler, sounds-like word filters, regex

Re: Is this solr 1.2 a final version?

2007-06-07 Thread Yonik Seeley
On 6/7/07, Thierry Collogne [EMAIL PROTECTED] wrote: I was just downloading solr and noticed that there is a 1.2 version available. Is this the final 1.2 version? Is this the version that is to be used? Yes. A release is typically available a day before an announcement because it takes a

RE: highlight and wildcards ?

2007-06-07 Thread Xuesong Luo
Frédéric, I asked a similar question several days ago; it seems we don't have a perfect solution when using a prefix wildcard with highlighting. Here is what Chris said: in Solr 1.1, highlighting used the info from the raw query to do highlighting, hence in your query for consult* it would

Re: highlight and wildcards ?

2007-06-07 Thread Frédéric Glorieux
Xuesong (?), Thanks a lot for your answer, and sorry for not having scanned the archives first. This is a really good and understandable reason, but sad for my project. Prefix queries will be the main activity of my users (they need to search Latin texts, so that domin* is enough to match dominus or

RE: highlight and wildcards ?

2007-06-07 Thread Xuesong Luo
Same in my project. Chris does mention we can put a ? before the *, so instead of domin*, you can use domin?*, however that requires at least one char following your search string. -Original Message- From: Frédéric Glorieux [mailto:[EMAIL PROTECTED] Sent: Thursday, June 07, 2007 10:37
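
The workaround described above can be applied on the client before the query is sent. The following is a small sketch; the `highlightable_prefix` name is hypothetical, and it only handles a single term ending in a bare `*`.

```python
def highlightable_prefix(term):
    """Turn 'domin*' into 'domin?*'; leave every other term untouched."""
    if len(term) > 1 and term.endswith("*") and not term.endswith("?*"):
        return term[:-1] + "?*"
    return term


print(highlightable_prefix("domin*"))
```

As the message notes, the rewritten query requires at least one character after the prefix, so a document containing only the bare prefix would no longer match.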

Multi-language indexing and searching

2007-06-07 Thread Daniel Alheiros
Hi, I'm just starting to use Solr and so far, it has been a very interesting learning process. I wasn't a Lucene user, so I'm learning a lot about both. My problem is: I have to index and search content in several languages. My scenario is a bit different from others that I've already read in

filter query speed

2007-06-07 Thread Michael Thessel
Hello UG, I've got a problem with filtered queries. I have an index with about 8 million documents. I save a timestamp (not the time of indexing) for each document as an integer field. Querying the index is pretty fast. But when I filter on the timestamp the queries are extremely slow, even if

Re: highlight and wildcards ?

2007-06-07 Thread Frédéric Glorieux
Same in my project. Chris does mention we can put a ? before the *, so instead of domin*, you can use domin?*, however that requires at least one char following your search string. Right, it works well, and one char is a detail. With a?* I get the documented lucene error maxClauseCount is

Re: how to crawl when Solr is search engine?

2007-06-07 Thread Mike Klaas
On 7-Jun-07, at 1:04 AM, Manoharam Reddy wrote: Some musing:- (I have used Nutch before and one thing I observed there was that if I delete the crawl folder when Nutch is running, users can still search and obtain proper results. It seems Nutch caches all the indexes in the memory when it

Re: Logging errors from multiple solr instances

2007-06-07 Thread Chris Hostetter
: Is this addressed in 1.2 or is running multiple instances of indexes : such a Bad Idea that supporting this would be leading a fool further astray? I still haven't had a chance to try it myself using Tomcat, but here's what i found the last time someone asked about this...

TextField case sensitivity

2007-06-07 Thread Xuesong Luo
I ran into a problem when searching on a TextField. When I pass q=William or q=WILLiam, solr is able to find records whose default search field value is William, however if I pass q=WilliAm, solr did not return anything. I searched the archive; Yonik mentioned the LowerCaseFilterFactory doesn't

Re: TextField case sensitivity

2007-06-07 Thread Yonik Seeley
On 6/7/07, Xuesong Luo [EMAIL PROTECTED] wrote: I ran into a problem when searching on a TextField. When I pass q=William or q=WILLiam, solr is able to find records whose default search field value is William, however if I pass q=WilliAm, solr did not return anything. Sounds like

Re: highlight and wildcards ?

2007-06-07 Thread Chris Hostetter
: With a?* I get the documented lucene error : maxClauseCount is set to 1024 Which is why Solr converts PrefixQueries to ConstantScorePrefixQueries that don't have that problem --the trade off being that they can't be highlighted, and we're right back where we started. It's a question of

Re: TextField case sensitivity

2007-06-07 Thread Ryan McKinley
Have you taken a look at the output from admin/analysis? http://localhost:8983/solr/admin/analysis.jsp?highlight=on This lets you see what tokens are generated for index/query. From your description, I'm suspicious that the generated tokens are actually: willi am Also, if you want the same

Re: filter query speed

2007-06-07 Thread Yonik Seeley
On 6/7/07, Michael Thessel [EMAIL PROTECTED] wrote: I've got a problem with filtered queries. I have an index with about 8 million documents. I save a timestamp (not the time of indexing) for each document as an integer field. Querying the index is pretty fast. But when I filter on the timestamp

Re: solr+hadoop = next solr

2007-06-07 Thread Mike Klaas
On 6-Jun-07, at 7:44 PM, Jeff Rodenburg wrote: I've been exploring distributed search, as of late. I don't know about the next solr but I could certainly see a distributed solr grow out of such an expansion. I've implemented a highly-distributed search engine using Solr (200m docs and

Re: filter query speed

2007-06-07 Thread Michael Thessel
Hey Yonik, thanks a lot for your quick reply. I suspect that the endpoint to your dateline filter changes often, hence caching is doing no good. Is the endpoint (1181237598) derived from the current time? Yes, it is. If so, there are some things you can do: 1) make it faster to generate
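
The snippet above is cut off before the suggestions are listed, so the following is not necessarily what the reply recommends. It is only a sketch of one way to keep a time-derived endpoint from changing on every request: round it to a coarser granularity so that repeated queries produce an identical filter string and can reuse a cached filter. The `rounded_range_filter` name, the one-day window, and the one-hour granularity are assumptions; `dateline` and the sample timestamp come from the message.

```python
def rounded_range_filter(field, now, window, granularity=3600):
    """Build a range filter whose endpoints change only once per `granularity` seconds."""
    # Round the upper endpoint up to the next boundary so `now` stays inside the range.
    upper = now - (now % granularity) + granularity
    lower = upper - window
    return f"{field}:[{lower} TO {upper}]"


# Two requests made in the same hour yield the same filter string.
first = rounded_range_filter("dateline", 1181237598, window=86400)
second = rounded_range_filter("dateline", 1181237000, window=86400)
print(first)
```

The trade-off is precision: the range may extend up to one granularity step past the current time, which only matters if documents can carry timestamps in the future.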

Re: filter query speed

2007-06-07 Thread Yonik Seeley
On 6/7/07, Michael Thessel [EMAIL PROTECTED] wrote: Is there a general speed problem with range searches in solr? It looks a bit strange to me that a query for a term takes 5 ms while adding a filter to the same result set takes 80s? It's completely dependent on the number of terms in the

RE: TextField case sensitivity

2007-06-07 Thread Xuesong Luo
I have WordDelimiterFilter defined in the schema. I didn't include it in my original email because I thought it didn't matter. It seems it does. Looks like WilliAm is treated as two words. That's why it didn't find a match. Thanks Xuesong -Original Message- From: [EMAIL PROTECTED]

RE: TextField case sensitivity

2007-06-07 Thread Xuesong Luo
Ryan, you are right, that's the problem. WilliAM is treated as two words by the WordDelimiterFilterFactory. Thanks Xuesong -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Thursday, June 07, 2007 11:30 AM To: solr-user@lucene.apache.org Subject: Re: TextField case
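
To illustrate the behaviour diagnosed above, here is a rough Python simulation of splitting a token at each lowercase-to-uppercase transition and then lowercasing the parts. It is not the actual WordDelimiterFilterFactory, which handles many more cases; the `split_on_case_change` name is hypothetical.

```python
import re


def split_on_case_change(token):
    """Split at each lowercase-to-uppercase boundary, then lowercase the parts."""
    # The zero-width pattern matches between a lowercase and an uppercase letter.
    parts = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split()
    return [part.lower() for part in parts]


for query in ("William", "WILLiam", "WilliAm"):
    print(query, split_on_case_change(query))
```

Under this model, William and WILLiam both reduce to the single token `william`, while WilliAm becomes the two tokens `willi` and `am`, which is consistent with the results reported in the thread.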

Re: solr+hadoop = next solr

2007-06-07 Thread Jeff Rodenburg
Mike - thanks for the comments. Some responses added below. On 6/7/07, Mike Klaas [EMAIL PROTECTED] wrote: I've implemented a highly-distributed search engine using Solr (200m docs and growing, 60+ servers). It is not a Solr-based solution in the vein of FederatedSearch--it is a

DisMax request handler doesn't work with stopwords?

2007-06-07 Thread Casey Durfee
It appears that if your search terms include stopwords and you use the DisMax request handler, you get no results whereas the same search with the standard request handler does give you results. Is this a bug or by design? Thanks, --Casey

Re: TextField case sensitivity

2007-06-07 Thread Mike Klaas
On 7-Jun-07, at 1:04 PM, Xuesong Luo wrote: Ryan, you are right, that's the problem. WilliAM is treated as two words by the WordDelimiterFilterFactory. I have found this behaviour a little too aggressive for my needs, so I added an option to disable it. Patch is here:

Re: solr+hadoop = next solr

2007-06-07 Thread Rafael Rossini
Hi, Jeff and Mike. Would you mind telling us about the architecture of your solutions a little bit? Mike, you said that you implemented a highly-distributed search engine using Solr as indexing nodes. What does that mean? You guys implemented a master, multi-slave solution for replication? Or

Re: DisMax request handler doesn't work with stopwords?

2007-06-07 Thread Chris Hostetter
: It appears that if your search terms include stopwords and you use the : DisMax request handler, you get no results whereas the same search with : the standard request handler does give you results. Is this a bug or by : design? dismax works just fine with stop words ... can you give a

Re: DisMax request handler doesn't work with stopwords?

2007-06-07 Thread Casey Durfee
Sure thing. I downloaded the latest version of Solr, started up the example server, and indexed the ipod_other.xml file. The following URLs give a result: http://localhost:8983/solr/select/?q=ipod http://localhost:8983/solr/select/?q=the+ipod

Re: DisMax request handler doesn't work with stopwords?

2007-06-07 Thread Mike Klaas
On 7-Jun-07, at 1:41 PM, Casey Durfee wrote: It appears that if your search terms include stopwords and you use the DisMax request handler, you get no results whereas the same search with the standard request handler does give you results. Is this a bug or by design? There is a subtlety

Re: DisMax request handler doesn't work with stopwords?

2007-06-07 Thread Casey Durfee
Thank you! That makes sense. --Casey Mike Klaas [EMAIL PROTECTED] 6/7/2007 2:35 PM On 7-Jun-07, at 1:41 PM, Casey Durfee wrote: It appears that if your search terms include stopwords and you use the DisMax request handler, you get no results whereas the same search with the standard

Re: highlight and wildcards ?

2007-06-07 Thread Frédéric Glorieux
Hoss, Thanks for all your information and pointers. I know that my problems are not mainstream.

ConstantScoreQuery @author yonik
  public void extractTerms(Set terms) {
    // OK to not add any terms when used for MultiSearcher,
    // but may not be OK for highlighting
  }

Re: highlight and wildcards ?

2007-06-07 Thread Mike Klaas
On 7-Jun-07, at 5:27 PM, Frédéric Glorieux wrote: Hoss, Thanks for all your information and pointers. I know that my problems are not mainstream. Have you tried commenting out getPrefixQuery in solr.search.SolrQueryParser? It should then revert to a regular lucene prefix query.