Re: multilingual list of stopwords

2007-10-24 Thread Daniel Alheiros
If you do want more stopwords sources, there is this one too: http://snowball.tartarus.org/algorithms/ And I would go for the language identification and then I would apply the proper set. Cheers, Daniel On 18/10/07 16:18, Maria Mosolova [EMAIL PROTECTED] wrote: Thanks a lot Peter! Maria
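
The approach Daniel describes (identify the language first, then apply the matching stopword set) might look roughly like the sketch below. The tiny stopword sets and the `remove_stopwords` helper are illustrative assumptions, not the Snowball lists themselves, and the language code is assumed to come from whatever language identifier is already in use.

```python
# Illustrative only: tiny hand-picked stopword sets keyed by language code.
# Real lists (for example the Snowball ones) are much longer.
STOPWORDS = {
    "en": {"the", "and", "of"},
    "de": {"der", "und", "die"},
}


def remove_stopwords(tokens, lang):
    """Drop stopwords for the detected language; unknown languages pass through."""
    stop = STOPWORDS.get(lang, set())
    return [t for t in tokens if t.lower() not in stop]
```

Passing tokens through unchanged for an unknown language is a design choice made here for safety; dropping nothing is usually less harmful than applying the wrong list.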

Re: Payloads for multiValued fields?

2007-10-24 Thread Alf Eaton
Yonik Seeley wrote: On 8/16/07, Alf Eaton [EMAIL PROTECTED] wrote: On 16 Aug 2007, at 17:20, Alf Eaton wrote: When searching a multiValued field, is it possible to know which of the multiple fields the match was in? For example if I have an index of documents, each of which has multiple

Re: Payloads for multiValued fields?

2007-10-24 Thread Yonik Seeley
On 10/24/07, Alf Eaton [EMAIL PROTECTED] wrote: Yonik Seeley wrote: Could you perhaps index the captions as #1 this is the first caption #2 this is the second caption And then just look for #n in the highlighted results? For display, you could also strip out the #n in the
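
A minimal client-side sketch of the #n workaround suggested here, assuming the application prefixes the captions before indexing and post-processes the highlighted snippets itself. The helper names are hypothetical and nothing in this block is Solr API.

```python
import re

# Matches a marker such as "#2 " at any position in the text.
MARKER = re.compile(r"#(\d+)\s*")


def tag_captions(captions):
    """Prefix each caption with '#n ' (1-based) before sending it to the index."""
    return [f"#{i} {text}" for i, text in enumerate(captions, start=1)]


def caption_numbers(highlighted):
    """Return the caption numbers whose markers appear in a highlighted snippet."""
    return [int(n) for n in MARKER.findall(highlighted)]


def strip_markers(text):
    """Remove the '#n ' markers again before displaying the text."""
    return MARKER.sub("", text)
```

One caveat with this sketch: a caption that itself contains text like "#3" would be mistaken for a marker, so a less common marker string may be safer in practice.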

Re: Payloads for multiValued fields?

2007-10-24 Thread Alf Eaton
Yonik Seeley wrote: On 10/24/07, Alf Eaton [EMAIL PROTECTED] wrote: Yonik Seeley wrote: Could you perhaps index the captions as #1 this is the first caption #2 this is the second caption And then just look for #n in the highlighted results? For display, you could also strip out the #n

Re: Forced Top Document

2007-10-24 Thread Matthew Runo
I'd love to know this, as I just got a development request for this very feature. I'd rather not spend time on it if it already exists. ++ | Matthew Runo | Zappos Development | [EMAIL PROTECTED] | 702-943-7833

Solr and security

2007-10-24 Thread Cool Coder
Hi Group, As far as I know, to use Solr we need to deploy it as a server and communicate with it over HTTP. What about its security? I.e., how can we ensure that it only accepts requests from a predefined set of users? Is there any way we can specify this in solr or

Empty field error when boosting a dismax query using bf

2007-10-24 Thread Alf Eaton
I'm trying to use the bf parameter to boost a dismax query based on the value of a certain (integer) field. The trouble is that for some of the documents this field is empty (rather than zero), which means that there's an error when using the bf parameter: -

RE: Solr and security

2007-10-24 Thread Wagner,Harry
One effective method is to block access to the port Solr runs on. Force application access to come thru the HTTP server, and let it map to the application server (i.e., like mod_jk does for Apache Tomcat). Simple, but effective. Cheers! harry -Original Message- From: Cool Coder

Re: Empty field error when boosting a dismax query using bf

2007-10-24 Thread Yonik Seeley
On 10/24/07, Alf Eaton [EMAIL PROTECTED] wrote: I'm trying to use the bf parameter to boost a dismax query based on the value of a certain (integer) field. The trouble is that for some of the documents this field is empty (rather than zero), which means that there's an error when using the

RE: Solr and security

2007-10-24 Thread Norskog, Lance
Solr does not do security itself. Servlet containers usually support various security options: account/password through HTTP authentication (very weak security) and certificates (very strong security) are what I would look at first. Lance -Original Message- From: Wagner,Harry

RE: Forced Top Document

2007-10-24 Thread Charlie Jackson
Do you know which document you want at the top? If so, I believe you could just add an OR clause to your query to boost that document very high, such as ?q=foo OR id:bar^1000 Tried this on my installation and it did, indeed push the document specified to the top. -Original Message-
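
As a rough illustration of the boosted OR clause above, the query string could be assembled on the client like this. The `forced_top_query` helper is hypothetical, and the field name `id` and the boost of 1000 are simply taken from the example in the message.

```python
from urllib.parse import urlencode


def forced_top_query(user_query, doc_id, boost=1000):
    """Append a heavily boosted clause for the document that should rank first."""
    return f"{user_query} OR id:{doc_id}^{boost}"


def forced_top_params(user_query, doc_id, boost=1000):
    """URL-encode the query so it can be appended to a select URL."""
    return urlencode({"q": forced_top_query(user_query, doc_id, boost)})
```

Here `forced_top_query("foo", "bar")` yields `foo OR id:bar^1000`, the query quoted in the message; the encoded form escapes the colon and caret.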

RE: Forced Top Document

2007-10-24 Thread Charlie Jackson
Yes, this will only work if the results are sorted by score (the default). One thing I thought of after I sent this out was that this will include the specified document even if it doesn't match your search criteria, which may not be what you want. -Original Message- From: mark

RE: Forced Top Document

2007-10-24 Thread Daniel Pitts
I'm going to be doing something similar, and I don't think I'll be sorting by score (although, that might be feasible). In my use-case though, we don't want to include something unless it is already matched by our filters. I'll probably end up just making two search hits, but it would be nice if

Re: Forced Top Document

2007-10-24 Thread Kyle Banerjee
This method Charlie suggested will work just fine with a minor tweak. For relevancy sorting ?q=foo OR (foo AND id:bar) For nonrelevancy sorting, all you need is a multilevel sort. Just add a bogus field that only the important document contains. Then sort by bogus field in descending order
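
To see why the tweak for relevancy sorting keeps non-matching documents out, the two query shapes can be modelled as plain set operations. This is only a sketch of the boolean semantics, with scoring ignored and hypothetical function names, not anything Solr executes.

```python
def or_id_results(matching_ids, featured_id):
    """q=foo OR id:bar -> the featured document is always in the result set."""
    return set(matching_ids) | {featured_id}


def or_and_id_results(matching_ids, featured_id):
    """q=foo OR (foo AND id:bar) -> featured document appears only if it matches foo."""
    matching = set(matching_ids)
    return matching | (matching & {featured_id})
```

In the second form the extra clause can never add a document to the result set; it can only contribute additional score to the featured document when that document already matches.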

Re: Forced Top Document

2007-10-24 Thread Mike Klaas
On 24-Oct-07, at 10:56 AM, Charlie Jackson wrote: Yes, this will only work if the results are sorted by score (the default). One thing I thought of after I sent this out was that this will include the specified document even if it doesn't match your search criteria, which may not be what

Re: Payloads for multiValued fields?

2007-10-24 Thread Mike Klaas
On 24-Oct-07, at 7:10 AM, Alf Eaton wrote: Yes, I was just trying that this morning and it's an improvement, though not ideal if the field contains a lot of text (in other words it's still a suboptimal workaround). I do think it might be useful for the response to contain an element

RE: Forced Top Document

2007-10-24 Thread Charlie Jackson
Took the words right out of my mouth! That second method would be particularly effective but will only work if you can identify these docs at index time. -Original Message- From: Kyle Banerjee [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 24, 2007 1:31 PM To:

Re: Tagging in solr

2007-10-24 Thread ARakesh
Hi, This is what I intended to do with my impl. Create a multi-valued field for the schema. Then when you prepare the document which you want to send to Solr to index, for example <doc> <productID>1234</productID> ... ... <tag>Java</tag> <tag>.Net</tag> </doc> And the next question of

Re: Payloads for multiValued fields?

2007-10-24 Thread Alf Eaton
Mike Klaas wrote: On 24-Oct-07, at 7:10 AM, Alf Eaton wrote: Yes, I was just trying that this morning and it's an improvement, though not ideal if the field contains a lot of text (in other words it's still a suboptimal workaround). I do think it might be useful for the response to contain

Re: Forced Top Document

2007-10-24 Thread Bill Fowler
The typical use case, though, is for the featured document to be on top only for certain queries. Like in an intranet where someone queries 401K or retirement or similar, you want to feature a document about benefits that would otherwise rank really low for that query. I have not been able to make

Re: Forced Top Document

2007-10-24 Thread Kyle Banerjee
The typical use case, though, is for the featured document to be on top only for certain queries. Like in an intranet where someone queries 401K or retirement or similar, you want to feature a document about benefits that would otherwise rank really low for that query. I have not been able to

Re: Forced Top Document

2007-10-24 Thread mark angelillo
That's the ticket exactly, Kyle. What I have is the ID of my document, so I indexed a dynamic field with name id_*. Then I just set that field for each document with the proper ID. So for example, to pop one document to the top of the index, I just run: q=field: value; id_700390+desc,

AW: Converting German special characters / umlaute

2007-10-24 Thread Matthias Eireiner
Dear list, it has been some time, but here is what I did. I had a look at Thomas Traeger's tip to use the SnowballPorterFilterFactory, which does not actually do the job. Its purpose is to convert regular ASCII into special characters. And I want it the other way, such that all special

DisMax and REQUIRED OR REQUIRED query rewrite

2007-10-24 Thread Otis Gospodnetic
Hi, I'm using DisMaxRequestHandler and trying to transform a user-entered multi-term boolean query (e.g. foo bar baz). I've configured DisMaxRH like so: <str name="qf"> name^1.2 <!-- I removed the manu field from here --> </str> <str name="pf"> manu^1.4

RE: Converting German special characters / umlaute

2007-10-24 Thread Norskog, Lance
Isn't this what ISOLatin1Filter does? Turn Björk into Bjork? This should be much faster than PatternReplaceFilterFactory. -Original Message- From: Matthias Eireiner [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 24, 2007 1:47 PM To: solr-user@lucene.apache.org Subject: AW:
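
For readers unfamiliar with accent folding, the Björk to Bjork mapping mentioned above can be approximated in a few lines. This is a generic Unicode-based sketch with a hypothetical `fold_accents` helper, not the Lucene filter's actual character table.

```python
import unicodedata


def fold_accents(text):
    """Decompose characters (NFD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Note that this sketch leaves characters without a decomposition (such as ß) untouched, and it maps ö to o rather than to the German transliteration oe.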

SpellCheckerRequestHandler and onlyMorePopular

2007-10-24 Thread Justin Knoll
I'm running the example Solr install with a custom schema.xml and solrconfig.xml. I'm seeing some unexpected results for searches using the SpellCheckerRequestHandler using the onlyMorePopular option. Namely, searching for certain terms with onlyMorePopular set to true returns a suggestion

Re: SpellCheckerRequestHandler and onlyMorePopular

2007-10-24 Thread Dave Lewis
Are you running off of a release? onlyMorePopular was only implemented in the trunk a few days ago (in earlier versions, even if you specified onlyMorePopular, it was ignored). dave On Oct 24, 2007, at 5:58 PM, Justin Knoll wrote: I'm running the example Solr install with a custom

where did my foreign language go?

2007-10-24 Thread Ian Holsman
Hi. I'm in the middle of bringing up a new solr server and am using the trunk. (where I was using an earlier nightly release of about 2-3 weeks ago on my old server) now, when I do a search for 日本 (japan) it used to show the kanji in the q area, but now it shows gibberish instead 日本

Re: where did my foreign language go?

2007-10-24 Thread Yonik Seeley
On 10/24/07, Ian Holsman [EMAIL PROTECTED] wrote: Hi. I'm in the middle of bringing up a new solr server and am using the trunk. (where I was using an earlier nightly release of about 2-3 weeks ago on my old server) now, when I do a search for 日本 (japan) it used to show the kanji in the q

Re: where did my foreign language go?

2007-10-24 Thread sunrise1984
Maybe the following is useful for you. (It comes from http://wiki.apache.org/solr/SolrTomcat) If you are going to query Solr using international characters (>127) using HTTP-GET, you must configure Tomcat to conform to the URI standard by accepting percent-encoded UTF-8. Edit Tomcat's
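
On the client side, percent-encoded UTF-8 for the 日本 query from this thread looks as follows. This is a small sketch using Python's standard library; the helper names are hypothetical, and the Latin-1 decode only imitates what a server does when it does not assume UTF-8.

```python
from urllib.parse import unquote, urlencode


def encode_query(q):
    """Percent-encode the query value as UTF-8 for use in an HTTP GET URL."""
    return urlencode({"q": q}, encoding="utf-8")


def decode_as_latin1(encoded_value):
    """Imitate a server that wrongly interprets the bytes as ISO-8859-1."""
    return unquote(encoded_value, encoding="latin-1")
```

`encode_query("日本")` produces `q=%E6%97%A5%E6%9C%AC`: three UTF-8 bytes per character. Decoding those same bytes as ISO-8859-1 yields six unrelated characters, which is the kind of gibberish described in the original question.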

New issue: request for limit parameter for search time, hits, and estimated ram usage

2007-10-24 Thread Norskog, Lance
http://issues.apache.org/jira/browse/SOLR-392 Summary: It would be good for end-user applications if Solr allowed searches to cease before finishing, and still return partial results.

Re: where did my foreign language go?

2007-10-24 Thread Ian Holsman
Thanks.. I'll do that sunrise1984 wrote: Maybe the following is useful for you. (It comes from http://wiki.apache.org/solr/SolrTomcat) If you are going to query Solr using international characters (>127) using HTTP-GET, you must configure Tomcat to conform to the URI standard by accepting

Re: My filters are not used

2007-10-24 Thread Yonik Seeley
On 10/24/07, Norskog, Lance [EMAIL PROTECTED] wrote: I am creating a filter that is never used. Here is the query sequence: q=*:*&fq=contentid:00*&start=0&rows=200 q=*:*&fq=contentid:00*&start=200&rows=200 q=*:*&fq=contentid:00*&start=400&rows=200 q=*:*&fq=contentid:00*&start=600&rows=200
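
The paging sequence quoted here (the same q and fq on every request, with start advancing by rows) could be generated by a client along these lines. The `page_queries` helper is hypothetical; the parameter names q, fq, start and rows are the ones used in the message.

```python
from urllib.parse import urlencode


def page_queries(q, fq, rows, pages):
    """Yield one query string per page; only 'start' changes between pages."""
    for page in range(pages):
        params = {"q": q, "fq": fq, "start": page * rows, "rows": rows}
        # Keep '*' and ':' readable instead of percent-encoding them.
        yield urlencode(params, safe="*:")
```

Because the fq value is identical on every page, each request in the sequence refers to the same filter and differs only in its start offset.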

Re: extending StandardRequestHandler gives ClassCastException

2007-10-24 Thread Doug Daniels
Don't know if you ever found a fix for this issue, but I experienced it tonight while trying to run solr through jetty in eclipse. The custom RequestHandler plugin was loading fine when running jetty normally from the command-line, but running it through eclipse hit the ClassCastException.