A question about solr score

2007-10-25 Thread zx zhang
Hi, everyone! As we know, Solr uses Lucene scoring. This score is the raw score. Scores returned from Hits aren't necessarily the raw score, however. If the top-scoring document scores greater than 1.0, all scores are normalized from that score, such that all scores from Hits are guaranteed to be 1
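A minimal sketch of the normalization described in this message, assuming it amounts to dividing every score by the top score whenever that top score exceeds 1.0. The function name normalize_scores is hypothetical and this is not Lucene's own code.

```python
def normalize_scores(raw_scores):
    """Scale scores by the top score, but only when the top score exceeds 1.0."""
    if not raw_scores:
        return []
    top = max(raw_scores)
    if top <= 1.0:
        # Nothing exceeds 1.0, so the raw scores are returned unchanged.
        return list(raw_scores)
    # Otherwise every score is divided by the top score, so the best hit gets 1.0.
    return [score / top for score in raw_scores]
```

Under this assumption a normalized score is only comparable within one result list, not across different queries.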

phrase query performance

2007-10-25 Thread Haishan Chen
I am a new Solr user and wonder if anyone can help me with these questions. I used Solr to index about two million documents and query on it using standard request handler. I disabled all cache. I found phrase query was substantially slower than the usual query. The statistic I collected is as follo

RE: extending StandardRequestHandler gives ClassCastException

2007-10-25 Thread Haishan Chen
Hi Hoss, I am sorry about that. I know it was not very polite to do so. I was new to the community and new to mailing list. I was experimenting how to start a discussion. I tried starting the discussion by sending a new email to [EMAIL PROTECTED] and [EMAIL PROTECTED] But it doesn't seem to

Re: Solr Index update - specific field only

2007-10-25 Thread Chris Hostetter
there is some work in progress on this, but it isn't ready for prime time yet ... you are welcome to be an early adopter and try out some of the patches... https://issues.apache.org/jira/browse/SOLR-139 -Hoss

Re: SOLR 1.3 Release?

2007-10-25 Thread Pieter Berkel
On 26/10/2007, James liu <[EMAIL PROTECTED]> wrote: > > where i can read 1.3 new features? > Take a look at CHANGES.txt in the root directory of svn trunk, or also here: http://svn.apache.org/viewvc/lucene/solr/trunk/CHANGES.txt Piete

Re: SOLR 1.3 Release?

2007-10-25 Thread James liu
where i can read 1.3 new features? 2007/10/26, Venkatraman S <[EMAIL PROTECTED]>: > > On 10/26/07, Mike Klaas <[EMAIL PROTECTED]> wrote: > > > > If we did a 1.2.x, it should (imo) contain no new features, only > > important bugfixes. > > > I have been having a look at the trunk for quite some time n

Solr Index update - specific field only

2007-10-25 Thread Jae Joo
Hi, I have an index with a field that is NOT stored and would like to update a field that is indexed and stored. Updating the index requires all fields, same as the original (before updating), along with the updated field. Is there any way to post "JUST UPDATED FIELD ONLY"? Here is an example. field indexed stored -

Re: SOLR 1.3 Release?

2007-10-25 Thread Venkatraman S
On 10/26/07, Mike Klaas <[EMAIL PROTECTED]> wrote: > > If we did a 1.2.x, it should (imo) contain no new features, only > important bugfixes. I have been having a look at the trunk for quite some time now, and must say that it's changing pretty fast. Having an interim release now will require more

Re: SOLR 1.3 Release?

2007-10-25 Thread Mike Klaas
On 25-Oct-07, at 4:50 PM, patrick o'leary wrote: It might be good though to have an interim release say 1.2.x Which would simply allow patches, and valuable contributions to get added to the trunk. I'm not sure what you mean. The "problem" is that there are lots of valuable contributions in

Re: Solr and security

2007-10-25 Thread Nick Jenkin
You have to remember that Solr is search, not security; it's not considered a great idea to have it publicly accessible. If you want a public instance any requests to your solr instance should be "proxied" by some interface between solr and the user. e.g. user requests http://foobar.com/searchapi?k
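A rough sketch of the proxying idea, assuming the public-facing layer accepts only keywords and a row count from the user and builds the Solr request itself. The internal address, the row cap, and the name build_solr_url are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed internal Solr address; it is never exposed to the user directly.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
MAX_ROWS = 50  # hypothetical cap on how many rows a public request may ask for


def build_solr_url(user_keywords, rows=10):
    """Build the internal Solr URL from the only two inputs the user controls."""
    safe_rows = min(max(int(rows), 0), MAX_ROWS)
    params = {"q": user_keywords, "rows": safe_rows}
    # urlencode escapes the keywords, so they cannot add extra request parameters.
    return SOLR_SELECT_URL + "?" + urlencode(params)
```

Because the proxy builds the URL, the user never reaches the update handler or any parameter the proxy does not set.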

Re: SOLR 1.3 Release?

2007-10-25 Thread patrick o'leary
It might be good though to have an interim release say 1.2.x Which would simply allow patches, and valuable contributions to get added to the trunk. Right now, there are a few items which are falling behind because the trunk code is changing rapidly. A 1.2.x release will give you the opportun

Re: SOLR 1.3 Release?

2007-10-25 Thread Matthew Runo
I'm mostly interested in using the SOLRj library for now, and the spellcheck handler & work on per-field updates. I think I'll just go with 1.3 and report back if something seems broken. ++ | Matthew Runo | Zappos Development | [EMAIL

Re: Payloads for multiValued fields?

2007-10-25 Thread Mike Klaas
On 24-Oct-07, at 12:39 PM, Alf Eaton wrote: Mike Klaas wrote: On 24-Oct-07, at 7:10 AM, Alf Eaton wrote: Yes, I was just trying that this morning and it's an improvement, though not ideal if the field contains a lot of text (in other words it's still a suboptimal workaround). I do think i

sorting on dynamic fields - good, bad, neither?

2007-10-25 Thread Charles Hornberger
Hi -- I'm building a Solr index to replace an existing RDBMS-based system, and I have one requirement that I'm not sure how to best satisfy. Documents in our collection can have user-generated ratings associated with them; these user-generated ratings are aggregated by source (sources are basicall

Re: Performance Recommendation

2007-10-25 Thread Chris Hostetter
: Subject: Performance Recommendation : In-Reply-To: <[EMAIL PROTECTED]> http://people.apache.org/~hossman/#threadhijack When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email

Re: SOLR 1.3 Release?

2007-10-25 Thread Ryan McKinley
Yonik Seeley wrote: On 10/25/07, Matthew Runo <[EMAIL PROTECTED]> wrote: Any ideas on when 1.3 might be released? We're starting a new project and I'd love to use 1.3 for it - is SVN head stable enough for use? I think it's stable in the sense of "does the right thing and doesn't crash", but I

RE: extending StandardRequestHandler gives ClassCastException

2007-10-25 Thread Chris Hostetter
: Subject: RE: extending StandardRequestHandler gives ClassCastException http://people.apache.org/~hossman/#threadhijack When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email

Re: indexing one documents with different populated fields causes a deletion of documents in with other populated fileds

2007-10-25 Thread Yonik Seeley
On 10/25/07, Anton Valdstein <[EMAIL PROTECTED]> wrote: > thanks, that explains a lot (:, > I have another question: about how the idf is calculated: > is the document frequency the sum of all documents containing the term in > one of their fields or just in the field the query contained? idfs are

Re: field name synonyms

2007-10-25 Thread Maria Mosolova
Thanks Otis. Yes, I can change the application, just hoped that there might be a better way to handle the situation ... Maria Otis Gospodnetic wrote: I suppose you could use copyField functionality, though that feels like an overkill. Why not just have a map with field name aliases in your a

Re: SOLR 1.3 Release?

2007-10-25 Thread Yonik Seeley
On 10/25/07, Matthew Runo <[EMAIL PROTECTED]> wrote: > Any ideas on when 1.3 might be released? We're starting a new project > and I'd love to use 1.3 for it - is SVN head stable enough for use? I think it's stable in the sense of "does the right thing and doesn't crash", but IMO isn't stable in t

Re: Performance Recommendation

2007-10-25 Thread Erik Hatcher
On Oct 25, 2007, at 4:19 PM, Wagner,Harry wrote: Where is a good place to look for some performance recommendations? We have a 2.4G index running on server with 16G. Overall performance is very good, but the initial sort on an index is too slow. Any idea what, if anything, in the solrConfig w

SOLR 1.3 Release?

2007-10-25 Thread Matthew Runo
Any ideas on when 1.3 might be released? We're starting a new project and I'd love to use 1.3 for it - is SVN head stable enough for use? ++ | Matthew Runo | Zappos Development | [EMAIL PROTECTED] | 702-943-7833 +--

Re: indexing one documents with different populated fields causes a deletion of documents in with other populated fileds

2007-10-25 Thread Anton Valdstein
thanks, that explains a lot (:, I have another question: about how the idf is calculated: is the document frequency the sum of all documents containing the term in one of their fields or just in the field the query contained? On 10/25/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: > > On 10/25/07, A

Performance Recommendation

2007-10-25 Thread Wagner,Harry
Where is a good place to look for some performance recommendations? We have a 2.4G index running on server with 16G. Overall performance is very good, but the initial sort on an index is too slow. Any idea what, if anything, in the solrConfig would help that? Thanks... harry

Re: field name synonyms

2007-10-25 Thread Otis Gospodnetic
I suppose you could use copyField functionality, though that feels like an overkill. Why not just have a map with field name aliases in your app that rewrites the field names before sending the query to Solr? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Mess
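The alias-map suggestion could look roughly like the sketch below. It assumes the application rewrites field prefixes such as t: before sending the query to Solr. The alias table and the name rewrite_field_names are hypothetical, and the pattern does not try to skip colons inside quoted phrases.

```python
import re

# Hypothetical alias table kept in the application: alias -> real field name.
FIELD_ALIASES = {"t": "title"}

# A field prefix is a run of word characters followed by a colon,
# not preceded by another word character or a dot.
_FIELD_PREFIX = re.compile(r"(?<![\w.])(\w+):")


def rewrite_field_names(query, aliases=FIELD_ALIASES):
    """Replace aliased field names in a query string with the real field names."""
    def swap(match):
        name = match.group(1)
        return aliases.get(name, name) + ":"
    return _FIELD_PREFIX.sub(swap, query)
```

With this table, both title:query and t:query reach Solr as title:query.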

field name synonyms

2007-10-25 Thread Maria Mosolova
Hello, I am trying to figure out whether there is a way to specify field name synonyms in Solr/Lucene schema. For instance, I have a field with the name "title" in the database and want to be able to use queries: title:query t:query to get the data from the same field. Is there a way to do thi

Re: Delete index and "commit or optimize"

2007-10-25 Thread Otis Gospodnetic
You don't need to optimize an index after deletion, just commit and you are done. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jae Joo <[EMAIL PROTECTED]> To: solr-user Sent: Thursday, October 25, 2007 3:10:49 PM Subject: Delete index and "
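As an illustration of delete followed by commit, the sketch below builds the two XML update messages. It assumes they are then posted, in this order, to the Solr update handler (for example /solr/update on the Solr host, which is an assumption about the deployment). The helper name is hypothetical.

```python
from xml.sax.saxutils import escape


def delete_by_id_message(doc_id):
    """Build the XML update message that deletes one document by its unique key."""
    return "<delete><id>%s</id></delete>" % escape(str(doc_id))


# The commit message makes the deletion visible to searchers.
# No <optimize/> message is sent, following the advice above.
COMMIT_MESSAGE = "<commit/>"
```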

Delete index and "commit or optimize"

2007-10-25 Thread Jae Joo
Hi, I have a 9g index and am trying to delete a couple of documents. The actual deletion is working fine. Here is my question. Do I have to OPTIMIZE the index after deleting? or just COMMIT it? The original index is already optimized. Thanks, Jae Joo

Re: indexing one documents with different populated fields causes a deletion of documents in with other populated fileds

2007-10-25 Thread Yonik Seeley
On 10/25/07, Anton Valdstein <[EMAIL PROTECTED]> wrote: > Does solr check automatically for duplicate texts in other fields and > delete documents that have the same text stored in other fields? Solr automatically overwrites (deletes old versions of) documents with the same uniqueKey field (no
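The overwrite behaviour can be pictured with a toy model: an index keyed by the uniqueKey value, where adding a document replaces whatever was stored under the same key. This is only an analogy in Python, not how Solr stores documents, and the key field name "id" is an assumption.

```python
def add_documents(index, docs, unique_key="id"):
    """Add documents to a dict-based toy index; a repeated key replaces the old document."""
    for doc in docs:
        # The whole old document is dropped, including fields the new one lacks.
        index[doc[unique_key]] = doc
    return index
```

In this model, adding a second document with the same key but a different populated field leaves only the second document, which matches the symptom described in the thread.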

Re: My filters are not used

2007-10-25 Thread Yonik Seeley
On 10/25/07, Norskog, Lance <[EMAIL PROTECTED]> wrote: > This search has up to 8000 records. does not compute... Are you saying there are 8000 records with contentid:00* > Does this require a query cache of > 8000 records? No, one query == one query cache entry. >When is the query cache filled?

indexing one documents with different populated fields causes a deletion of documents in with other populated fileds

2007-10-25 Thread Anton Valdstein
Hi, I have 2 fields defined in the schema.xml. One is named ItalianTitle and the other is named ItalianOrEnglishTitle_t. I want to index first all the Italian titles into documents having the Italian texts stored and indexed in the ItalianTitle field while these documents should have the field Ital

Re: Forced Top Document

2007-10-25 Thread mark angelillo
Thanks for your thoughts, Chris. I agree with you about the user's experience. Snooth doesn't serve any ads/sponsored results -- the goal here is to make sure that the most recent document the user has acted on shows up top in searches for recent activity. My aim is to forcibly preserve the

RE: My filters are not used

2007-10-25 Thread Norskog, Lance
This search has up to 8000 records. Does this require a query cache of 8000 records? When is the query cache filled? This answers a second question: the filter design is intended for small search sets. I'm interested in selecting maybe 1/10 of a few million records as a search limiter. Is it possi

Re: Payloads for multiValued fields?

2007-10-25 Thread Alf Eaton
Alf Eaton wrote: > Mike Klaas wrote: >> On 24-Oct-07, at 7:10 AM, Alf Eaton wrote: >>> Yes, I was just trying that this morning and it's an improvement, though >>> not ideal if the field contains a lot of text (in other words it's still >>> a suboptimal workaround). >>> >>> I do think it might be u

Re: multilingual list of stopwords

2007-10-25 Thread Maria Mosolova
Thank you very much Daniel! Maria Daniel Alheiros wrote: If you do want more stopwords sources, there is this one too: http://snowball.tartarus.org/algorithms/ And I would go for the language identification and then I would apply the proper set. Cheers, Daniel On 18/10/07 16:18, "Maria Mosol

Re: Forced Top Document

2007-10-25 Thread Walter Underwood
On 10/25/07 12:11 AM, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > this type of question typically falls into two use cases: > 1) "targeted ads" > 2) "sponsored results" 3) Best bets (editorial results) The query "house" should return "House, M.D." as the first hit, but that is rather hard

Re: prefix-search ingnores the lowerCaseFilter

2007-10-25 Thread Yonik Seeley
On 10/25/07, Max Scheffler <[EMAIL PROTECTED]> wrote: > Is it possible that the prefix-processing ignores the filters? Yes, it's a known limitation that we haven't worked out a fix for yet. The issue is that you can't just run the prefix through the filters because of things like stop words, stemm

prefix-search ingnores the lowerCaseFilter

2007-10-25 Thread Max Scheffler
Hi, I want to perform a prefix-search which ignores case. To do this I created a fieldType called suggest: Entries (terms) could be 'foo', 'bar'... A request like http://localhost:8983/solr/select/?rows=0&facet=true&q=*:*&facet.field=suggest&facet.prefi

Re: Forced Top Document

2007-10-25 Thread Yonik Seeley
On 10/25/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: > > : The typical use case, though, is for the featured document to be on top only > : for certain queries. Like in an intranet where someone queries 401K or > : retirement or similar, you want to feature a document about benefits that > : wo

Re: Score customization

2007-10-25 Thread Otis Gospodnetic
Victoria, Either use FunctionQuery's or hack around HitCollector.collect(int, float) in SolrIndexSearcher...and adjust the score using the additional values you mentioned. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Victoria Kaganski <[EM

Score customization

2007-10-25 Thread Victoria Kaganski
Hi! My system uses solr to build a searchable archive of documents. I need to override the default scoring/similarity function because factors in addition to query relevancy have to be considered. For example, each document has "updated on" and "source" fields, which should influence the sco

RE: extending StandardRequestHandler gives ClassCastException

2007-10-25 Thread Haishan Chen
I am a new Solr user and wonder if anyone can help me with these questions. I used Solr to index about two million documents and query on it using standard request handler. I disabled all cache. I found phrase query was substantially slower than the usual query. The statistic I collected is as follo

Re: AW: Converting German special characters / umlaute

2007-10-25 Thread Thomas Traeger
Hi, the SnowballPorterFilterFactory is a complete stemmer that transforms words to their basic form (laufen -> lauf, läufer -> lauf). One part of that process is replacing language specific special characters. So SnowballPorterFilterFactory does what you wanted (beside other things). I menti

Re: Forced Top Document

2007-10-25 Thread Chris Hostetter
: The typical use case, though, is for the featured document to be on top only : for certain queries. Like in an intranet where someone queries 401K or : retirement or similar, you want to feature a document about benefits that : would otherwise rank really low for that query. I have not been able