About the example in the wiki page of FunctionQuery

2010-06-17 Thread Chia Hao Lo
( I've sent this mail two days ago, but I cannot find it in the mail archive. So I guess the mail is not sent successfully. Sorry for sending this mail twice in case that it did send. ) Hi, I'm a newbie to Solr and have a question about the example in FunctionQuery. I've read the document

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread Otis Gospodnetic
Mitch, If you use Nutch+Solr then you wouldn't *index* the fetched content with Nutch. Solr doesn't know anything about OPIC, but I suppose you can feed the OPIC score computed by Nutch into a Solr field and use it during scoring, if you want, say with a function query. Yes, ES has built-in

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
Solr doesn't know anything about OPIC, but I suppose you can feed the OPIC score computed by Nutch into a Solr field and use it during scoring, if you want, say with a function query. Oh! Yes, that makes more sense than using the OPIC as doc-boost-value. :-) Anywhere at the Lucene Mailing

Re: Field Collapsing SOLR-236

2010-06-17 Thread Rakhi Khatwani
Hi Moazzam, Yup, I have encountered the same thing. Build errors after applying the patch. Rakhi On Thu, Jun 17, 2010 at 3:33 AM, Moazzam Khan moazz...@gmail.com wrote: I got the code from trunk again and now I get this error: [javac] symbol : class StringIndex [javac]

Get total number of results when field collapsing is enabled

2010-06-17 Thread Adrian Pemsel
Hi Folks, Is there any way to get or estimate the total number of results when using field collapsing (SOLR-236) without using faceting or a second query? Kind Regards, Adrian Pemsel -- http://www.jusmeum.de

RejectedExecutionException when shutttingdown corecontainer

2010-06-17 Thread NarasimhaRaju
Hi, I am using solr 1.3 and when indexing i am getting RejectedExecutionException after processing the last batch of update records from the database. happening when coreContainer.shutdown() is called after processing the last record. i have autocommits enabled based on maxTime which is 10

Re: how to apply patch SOLR-1316

2010-06-17 Thread Koji Sekiguchi
As you can see both versions don't appear to be working. I tried building each but neither would compile. Which version/tag should be used when applying this patch? In general, a patch is written against the latest trunk branch as of then. For the SOLR-1316.patch, it was posted 2010-5-31,

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread Otis Gospodnetic
Mitch, Yes, one day. But it sounds like you are not aware of ExternalFileField, which you can use today: http://search-lucene.com/?q=ExternalFileField&fc_project=Solr Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/

Re: About the example in the wiki page of FunctionQuery

2010-06-17 Thread Otis Gospodnetic
Hi, I think that + there is just a space (like %20). Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Chia Hao Lo fca...@gmail.com To: solr-user@lucene.apache.org Sent: Thu, June 17,
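The point about + standing for a space can be checked with a few lines of Python. This is only a sketch using the standard library; the query value below is a made-up stand-in, not the actual example from the FunctionQuery wiki page.

```python
from urllib.parse import quote_plus, unquote_plus

# Hypothetical query-string values, for illustration only.
encoded_with_plus = "fuel+cell"
encoded_with_pct20 = "fuel%20cell"

# In a URL query string both forms decode to the same text with a space.
decoded_plus = unquote_plus(encoded_with_plus)
decoded_pct20 = unquote_plus(encoded_with_pct20)

# A literal plus sign has to be sent as %2B to survive decoding.
literal_plus = quote_plus("1+1")

print(decoded_plus, "|", decoded_pct20, "|", literal_plus)
```

So if a function query really needs a literal plus sign, it would have to be percent-encoded in the request URL.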

Re: Indexing HTML files in SOLR

2010-06-17 Thread seesiddharth
Thank you so much for the reply...The link suggested by you is helpful but they have explain everything with use of curl command which I don't want to use. I was more interested in uploading the .html documents using HTTP web request. So I have stored all .html files at one location then

Autsuggest/autocomplete/spellcheck phrases

2010-06-17 Thread Blargy
How can I preserve phrases for either autosuggest/autocomplete/spellcheck? For example, we have a bunch of product listings, and if someone types: louis I want it to come up with Louis Vuitton. World ... World cup. Would I need n-grams? Shingling? Thanks -- View this message in context:

Document boosting troubles

2010-06-17 Thread dbashford
Brand new to this sort of thing so bear with me. For sake of simplicity, I've got a two field document, title and rank. Title gets searched on, rank has values from 1 to 10. 1 being highest. What I'd like to do is boost results of searches on title based on the documents rank. Because it's

Re: Autsuggest/autocomplete/spellcheck phrases

2010-06-17 Thread Michael
Blargy, I've been experimenting with this myself for a work project. What I did was use a combination of the two running the indexed terms through the Shingle factory and then through the edge n-gram filter. I did this in order to be able to match terms like : .net asp c# asp .net c# c# asp .net

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
Otis, you are right. I wasn't aware of this. At least not with such a large dataList (let's think of an index with 4mio docs, this would mean we got an ExternalFile with 4mio records). But from what I've read at search-lucene.com it seems to perform very well. Thanks for the idea! Btw: Otis,

solr multi-node

2010-06-17 Thread Antonello Mangone
Hi to every one I have a question and I hope someone can help me. I know that mission critical reliability can be implemented with Lucene/Solr by using multi-node configurations, and redundant architectures, but I haven't found documentation on how to do it. Can someone help me to find a link to

Re: Document boosting troubles

2010-06-17 Thread MitchK
Hi, first of all, are you sure that row.put('$docBoost',docBoostVal) is correct? I think it should be row.put($docBoost,docBoostVal); - unfortunately I am not sure. Hm, I think, until you can solve the problem with the docBoosts itself, you should use a functionQuery. Use div(1, rank) as

Re: Document boosting troubles

2010-06-17 Thread MitchK
Sorry, I've overlooked your other question. <str name="bf">rank:1^10.0 rank:2^9.0 rank:3^8.0 rank:4^7.0 rank:5^6.0 rank:6^5.0 rank:7^4.0 rank:8^3.0 rank:9^2.0</str> This is wrong. You need to change bf to bq. bf - boosting function, bq - boosting query. -- View this
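To make the bf/bq distinction concrete, here is a rough Python sketch that builds both kinds of request parameters. It assumes a dismax handler and the two-field document (title, rank) described in this thread; the query text and the helper name rank_boost_query are hypothetical, and div(1,rank) is the function suggested earlier in the thread, not a tested recommendation.

```python
from urllib.parse import urlencode


def rank_boost_query(max_rank=9, top_boost=10.0):
    """Build 'rank:1^10.0 rank:2^9.0 ...' with the boost dropping by 1 per rank."""
    clauses = []
    for rank in range(1, max_rank + 1):
        boost = top_boost - (rank - 1)
        clauses.append(f"rank:{rank}^{boost:.1f}")
    return " ".join(clauses)


# bq takes ordinary query clauses with boosts (what the config above contains).
bq_params = {
    "defType": "dismax",
    "q": "example title",  # hypothetical user query
    "qf": "title",
    "bq": rank_boost_query(),
}

# bf takes a function query instead, for example the div(1,rank) suggestion.
bf_params = {
    "defType": "dismax",
    "q": "example title",  # hypothetical user query
    "qf": "title",
    "bf": "div(1,rank)",
}

print(urlencode(bq_params))
print(urlencode(bf_params))
```

The function form has the advantage that it does not need to be edited if the range of rank values changes.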

Re: Autsuggest/autocomplete/spellcheck phrases

2010-06-17 Thread Blargy
Thanks for the reply Michael. I'll definitely try that out and let you know how it goes. Your solution sounds similar to the one I've read here: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ There are some good comments in there too. I

Re: solr multi-node

2010-06-17 Thread MitchK
Antonello, here are a few links to the Solr Wiki: http://wiki.apache.org/solr/SolrReplication Solr Replication http://wiki.apache.org/solr/DistributedSearchDesign Distributed Search Design http://wiki.apache.org/solr/DistributedSearch Distributed Search http://wiki.apache.org/solr/SolrCloud

Re: Master master?

2010-06-17 Thread MitchK
What is the usecase for such an architecture? Do you send requests to two different masters for indexing and that's why they need to be synchronized? Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/Master-master-tp884253p903233.html Sent from the Solr -

Re: Autsuggest/autocomplete/spellcheck phrases

2010-06-17 Thread Michael
We base the auto-suggest on popular searches. Our site logs the search terms in a database and a simple query can give us a summary counting the number of times the search was entered and the number of results it returned, similar to the criteria used in the lucid imagination article you cite.

Re: Field Collapsing SOLR-236

2010-06-17 Thread Moazzam Khan
Hi Mark, Thanks for posting those links. I know this is probably a dumb question, but how do I make Solr work through your repository? I ask this because I don't see a build xml file and the folder structure is a bit different (I'm guessing I am not supposed to use ant on that :D) Thanks,

Plural only stemmer

2010-06-17 Thread Rachel Arbit
Hi all, I'm having trouble finding a stemmer that's less aggressive than the porter-stemmer, ideally, one that does only plural stemming. I've been trying to get KStem to work by copying the lucid-kstem and lucid-solr-kstem jars from the lucid distribution into solr/lib, but I get a classNotFound

Re: Autsuggest/autocomplete/spellcheck phrases

2010-06-17 Thread Blargy
Ok that makes perfect sense. What I did was use a combination of the two running the indexed terms through - I initially read this as you used your current index and use the terms from that to buildup your dictionary. -- View this message in context:

Re: Plural only stemmer

2010-06-17 Thread Ahmet Arslan
I'm having trouble finding a stemmer that's less aggressive than the porter-stemmer, ideally, one that does only plural stemming. Looks like PlingStemmer does this. http://www.mpi-inf.mpg.de/yago-naga/javatools/doc/javatools/parsers/PlingStemmer.html

federated / meta search

2010-06-17 Thread Sascha Szott
Hi folks, if I'm seeing it right Solr currently does not provide any support for federated / meta searching. Therefore, I'd like to know if anyone has already put efforts into this direction? Moreover, is federated / meta search considered a scenario Solr should be able to deal with at all or

Re: Field Collapsing SOLR-236

2010-06-17 Thread Mark Diggory
Correct, it uses Maven and just constructs the WAR executable; it is still up to you to configure the location of your Solr home directory. svn co https://scm.dspace.org/svn/repo/modules/dspace-solr/trunk solr cd solr mvn package then you can go into the webapp/target directory and get the

Re: Field Collapsing SOLR-236

2010-06-17 Thread Erik Hatcher
On Jun 16, 2010, at 7:31 PM, Mark Diggory wrote: p.s. I'd be glad to contribute our Maven build re-organization back to the community to get Solr properly Mavenized so that it can be distributed and released more often. For us the benefit of this structure is that we will be able to

Re: Document boosting troubles

2010-06-17 Thread dbashford
One problem down, two left! =) bf == bq did the trick, thanks. Now at least if I can't get the DIH solution working I don't have to tack that on every query string. Taking the quotes away from $docBoost results in a syntax error. Needs to be quoted. Changed it up to this and still no luck

DismaxRequestHandler

2010-06-17 Thread Blargy
I have a title field and a description field. I am searching across both fields but I don't want description matches unless they are within some slop of each other. How can I query for this? It seems that I'm getting back crazy results when there are matches that are nowhere near each other -- View

Re: Field Collapsing SOLR-236

2010-06-17 Thread Martijn v Groningen
I've added a new patch to the issue, so building the trunk (rev 955615) with the latest patch should not be a problem. Due to recent changes in the Lucene trunk the patch was not compatible. On 17 June 2010 20:20, Erik Hatcher erik.hatc...@gmail.com wrote: On Jun 16, 2010, at 7:31 PM, Mark

Re: DismaxRequestHandler

2010-06-17 Thread Joe Calderon
the qs parameter affects matching, but you have to wrap your query in double quotes, e.g. q="oil spill"&qf=title description&qs=4&defType=dismax I'm not sure how to formulate such a query to apply that rule just to description, maybe with nested queries ... On Thu, Jun 17, 2010 at 12:01 PM, Blargy
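A quoted phrase and its slop are easy to get wrong once URL encoding is involved, so here is a small Python sketch of how such a request could be assembled. The parameter names (q, qf, qs, defType) come from the message above; everything else, including the choice of Python, is just an illustration.

```python
from urllib.parse import parse_qs, urlencode

# The double quotes are part of the q value: qs only applies to explicit phrases.
params = {
    "q": '"oil spill"',
    "qf": "title description",
    "qs": "4",
    "defType": "dismax",
}

query_string = urlencode(params)
print(query_string)

# Decoding the string again shows the quotes survive the round trip.
round_trip = parse_qs(query_string)
print(round_trip["q"][0])
```

Note that this applies the slop to every field in qf, which is exactly the limitation raised in the message.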

Re: Field Collapsing SOLR-236

2010-06-17 Thread Moazzam Khan
I knew it wasn't me! :) I found the patch just before I read this and applied it to the trunk and it works! Thanks Mark and martijn for all your help! - Moazzam On Thu, Jun 17, 2010 at 2:16 PM, Martijn v Groningen martijn.is.h...@gmail.com wrote: I've added a new patch to the issue, so

RE: federated / meta search

2010-06-17 Thread Markus Jelsma
Hi, Check out Solr sharding [1] capabilities. I never tested it with different schemas but if each node is queried with fields that it supports, it should return useful results. [1]: http://wiki.apache.org/solr/DistributedSearch Cheers. -Original message- From: Sascha

Re: solr multi-node

2010-06-17 Thread Antonello Mangone
Mitch, thank you very much for your help, I'll read all the links you gave me. 2010/6/17 MitchK mitc...@web.de Antonello, here are a few links to the Solr Wiki: http://wiki.apache.org/solr/SolrReplication Solr Replication http://wiki.apache.org/solr/DistributedSearchDesign Distributed

Re: Plural only stemmer

2010-06-17 Thread Robert Muir
I created LUCENE-2503 to address this. On Thu, Jun 17, 2010 at 12:56 PM, Rachel Arbit rac...@lookin2.com wrote: Hi all, I'm having trouble finding a stemmer that's less aggressive than the porter-stemmer, ideally, one that does only plural stemming. I've been trying to get KStem to work by

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread Otis Gospodnetic
I didn't open the issue, Mitch, but feel free to do it. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: MitchK mitc...@web.de To: solr-user@lucene.apache.org Sent: Thu, June 17, 2010

Question on dynamic fields

2010-06-17 Thread bbarani
Hi, I am facing some issue with dynamic fields. I have 2 fields (UID and ID) on which I want to do whole word search only.. I made those 2 fields to be of type 'string'. <field name="uid" type="string" indexed="true" stored="true"/> I also have a dynamic field with textgen field type as below

defType=Dismax questions

2010-06-17 Thread Blargy
Sorry for the repost but I posted under DismaxRequestHandler when I should have listed it as DismaxQueryParser, i.e. I'm using defType=dismax. I have a title field and a description field. I am searching across both fields but I don't want description matches unless they are within some slop of

DataImportHandler + docBoost

2010-06-17 Thread dbashford
Pulled this out of another thread of mine as it's the only bit left that I haven't been able to figure out. Can someone show me briefly how one would include a docBoost inside a DIH? I've got something like this... var rank = row.get('rank'); switch (rank) {

[ANN] Free Webinar: June 24: How Cisco uses Lucene/Solr w/ Social Networks

2010-06-17 Thread Chris Hostetter
(cross posted announcement, please keep any replies to gene...@lucene) On behalf of Lucid Imagination, I'd like to invite folks to a free Webinar we're hosting on June 24th... How Cisco’s Pulse uses Lucene/Solr to put Social Networks to Work Thursday, June 24, 2010

Re: Document boosting troubles

2010-06-17 Thread MitchK
Hi, One problem down, two left! =) bf == bq did the trick, thanks. Now at least if I can't get the DIH solution working I don't have to tack that on every query string. I would really recommend to use a boost function. If your rank will change in future implementations, you do not

Re: DismaxRequestHandler

2010-06-17 Thread MitchK
Joe, please, can you provide an example of what you are thinking of? Subqueries with Solr... I've never seen something like that before. Thank you! Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/DismaxRequestHandler-tp903641p904142.html Sent from

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
Otis, And again I wished I were registered. I will check the JIRA and when I feel comfortable with it, I will open it. Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-and-Nutch-Droids-to-use-or-not-to-use-tp900069p904145.html Sent from the Solr -

Re: Question on dynamic fields

2010-06-17 Thread MitchK
Barani, without more background on dynamic fields, I would say that the easiest way would be to define a suffix for each of the fields you want to index into the mentioned dynamic field and to redefine your dynamic field - condition. If suffix does not work, because of other dynamic-field

Exact match on a filter

2010-06-17 Thread Pete Chudykowski
Hi, I'm trying with no luck to filter on the exact-match value of a field. Specifically: fq=brand:apple returns documents whose 'brand' field contains values like apple bottoms. Is there a way to formulate the fq expression to match precisely and only apple ? Thanks in advance for your

Re: Exact match on a filter

2010-06-17 Thread Joe Calderon
use a copyField and index the copy as type string, exact matches on that field should then work as the text won't be tokenized On Thu, Jun 17, 2010 at 3:13 PM, Pete Chudykowski pchudykow...@shopzilla.com wrote: Hi, I'm trying with no luck to filter on the exact-match value of a field.

Re: DismaxRequestHandler

2010-06-17 Thread Joe Calderon
see yonik's post on nested queries http://www.lucidimagination.com/blog/2009/03/31/nested-queries-in-solr/ so for example i thought you could possibly do a dismax query across the main fields (in this case just title) and OR that with _query_:{!description:'oil spill'~4} On Thu, Jun 17, 2010 at

Re: Exact match on a filter

2010-06-17 Thread Erik Hatcher
And when you do that, a best practice for fq'ing on a string field is: fq={!raw f=field_name}value That avoids query parsing and the hassles associated with escaping special characters. Erik On Jun 17, 2010, at 6:22 PM, Joe Calderon wrote: use a copyField and index the copy
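As a rough illustration of the {!raw f=field_name}value form, the Python sketch below builds such a filter and URL-encodes it. The helper raw_fq and the field name brand_exact (standing in for the string-typed copyField target) are hypothetical names chosen for this example.

```python
from urllib.parse import parse_qs, urlencode


def raw_fq(field_name, value):
    """Return a filter query using the raw parser, so the value needs no escaping."""
    return "{!raw f=" + field_name + "}" + value


# brand_exact is an assumed name for the string copy of the brand field.
fq = raw_fq("brand_exact", "apple")
query_string = urlencode({"q": "*:*", "fq": fq})

print(fq)
print(query_string)

# The braces and '!' get percent-encoded on the wire and decode back unchanged.
decoded_fq = parse_qs(query_string)["fq"][0]
```

Because the raw parser treats everything after the closing brace as a single term, values containing colons or spaces do not need query-syntax escaping, only normal URL encoding.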

RE: Exact match on a filter

2010-06-17 Thread Pete Chudykowski
Wonderful, Thank you both. Pete. -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: Thursday, June 17, 2010 3:33 PM To: solr-user@lucene.apache.org Subject: Re: Exact match on a filter And when you do that, a best practice for fq'ing on a string field is:

Re: Spellcheck and Solrconfig

2010-06-17 Thread Chris Hostetter
: We use Solr along with Drupal for our content management needs. The : solrconfig.xml that we have from Drupal mentions that we do not : spellcheck by default and here is our request handler from : solrconfig.xml. : : First question - why is it recommended that we do not spellcheck by :

Re: federated / meta search

2010-06-17 Thread Joe Calderon
yes, you can use distributed search across shards with different schemas as long as the query only references overlapping fields, i usually test adding new fields or tokenizers on one shard and deploy only after i verified it's working properly On Thu, Jun 17, 2010 at 1:10 PM, Markus Jelsma

dismax and AND as the default operator

2010-06-17 Thread Tommy Chheng
I'm using the dismax request handler and want to set the default operator to AND. Using the standard handler, i could just use the q.op or defaultOperator in the schema, but this doesn't work using the dismax request handler. For example, if I call solr/select/?q=fuel+cell, I want solr to

Re: dismax and AND as the default operator

2010-06-17 Thread Chris Hostetter
: I'm using the dismax request handler and want to set the default operator to : AND. : Using the standard handler, i could just use the q.op or defaultOperator in : the schema, but this doesn't work using the dismax request handler. : : For example, if I call solr/select/?q=fuel+cell, I want

Re: Optimize with waitFlush=false and waitSearcher=false takes a long time

2010-06-17 Thread Chris Hostetter
: Because waitFlush doesn't work currently, your client i didn't realize waitFlush is currently ignored ... is that an open bug in Jira, or was it a necessary change because of something else? do we at least log a warning if someone tries to use waitFlush=false? -Hoss

Re: ranking question

2010-06-17 Thread Chris Hostetter
: I want to reorder the results as per function like : sum(w0*score, w1*field1, w2*field2, w3*filed3,..) : : I am using solr1.4 and it seems it does not support sort by function. : : How can this be achieved : : I tried using : q=(query)^w0 (_val_:field1)^w1 (_val_:field2...)^w2 try

Solr Project Structure (was Re: Field Collapsing SOLR-236)

2010-06-17 Thread Mark Diggory
Erik, I try not to be exclusionary of others development tool choices in the selection of my own. However, just to surely stir up a nest of hornets in true Apache fashion... when I saw what was done with the templating of the Maven pom work that was originally donated to solr, I just cringed

Autocompletion with Solritas

2010-06-17 Thread Ken Krugler
I don't believe Solritas supports autocompletion out of the box. So I'm wondering if anybody has experience using the LucidWorks distro Solritas, plus the AJAX Solr auto-complete widget. I realize that AJAX Solr's autocomplete support is mostly just leveraging the jQuery Autocomplete

Re: Solr Project Structure (was Re: Field Collapsing SOLR-236)

2010-06-17 Thread Erik Hatcher
On Jun 17, 2010, at 7:44 PM, Mark Diggory wrote: when I saw what was done with the templating of the Maven pom work that was originally donated to solr, I just cringed at it. Most of us Solr committers are fairly anti-Maven or ambivalent about it at best, so it hasn't gotten much TLC,

Re: dismax and AND as the default operator

2010-06-17 Thread Erik Hatcher
dismax does not support the operator AND. It uses +/- only. set mm=100% (not 1), as Hoss said, and try your query again. Erik On Jun 17, 2010, at 8:08 PM, Tommy Chheng wrote: I don't think setting the mm helps. I have mm to 1 which means the query terms should be in at least one
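One detail that is easy to miss with mm=100% is that the percent sign itself has to be encoded when it travels in a URL. The Python sketch below shows this under the assumption that qf and the other dismax defaults are already configured in the request handler; the query text is taken from the original question.

```python
from urllib.parse import urlencode

# mm=100% asks dismax to require every optional clause, which acts like AND.
# Assumption: qf and other defaults live in the handler configuration.
params = {
    "defType": "dismax",
    "q": "fuel cell",
    "mm": "100%",
}

query_string = urlencode(params)
print(query_string)
```

With mm=1, by contrast, a single matching term is enough, which is why that setting behaves like OR.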

Re: dismax and AND as the default operator

2010-06-17 Thread Erik Hatcher
Hmmm, maybe I'm wrong and it does support AND. Looking at the code I don't see why it wouldn't, actually. Though I believe I've seen it documented that it isn't supported (or at least not advertised to support). Ok, from the dismax wiki page it says: This query handler supports an

Peformance tuning

2010-06-17 Thread Blargy
After indexing our item descriptions our index grew from around 3gigs to now 17.5 and I can see our search has deteriorated from sub 50ms searches to over 500ms now. The sick thing is I'm not even searching across that field at the moment but I plan to in the near future as well as include

Re: Peformance tuning

2010-06-17 Thread Erik Hatcher
first step is to do a debugQuery=true and see where the time is going on the server-side. If you're doing highlighting of a stored field, that can be a biggie. The timings will be in the debug output - be sure to look at both sections of the timings. Erik On Jun 17, 2010, at

Re: Peformance tuning

2010-06-17 Thread Blargy
Is there an alternative for highlighting on a large stored field? I thought for highlighting you needed the field stored? I really just need the excerpting feature for highlighting relevant portions of our item descriptions. Not sure if this is because of the index size (17.5G) or because of

Re: Peformance tuning

2010-06-17 Thread Erik Hatcher
Blargy - Please try to quote the mail you're responding to, at least the relevant piece. It's nice to see some context to the discussion. On Jun 17, 2010, at 10:23 PM, Blargy wrote: Is there an alternative for highlighting on a large stored field? Not currently. I thought for

Re: Autocompletion with Solritas

2010-06-17 Thread Erik Hatcher
Your wish is my command. Check out trunk, fire up Solr (ant run-example), index example data, hit http://localhost:8983/solr/browse - type in search box. Just used jQuery's autocomplete plugin and the terms component for now, on the name field. Quite simple to plug in, actually. Check

Re: Autocompletion with Solritas

2010-06-17 Thread Ken Krugler
You, sir, are on my Christmas card list. I'll fire it up tomorrow morning and let you know how it goes. -- Ken On Jun 17, 2010, at 8:34pm, Erik Hatcher wrote: Your wish is my command. Check out trunk, fire up Solr (ant run-example), index example data, hit http://localhost:8983/solr/browse

Re: Peformance tuning

2010-06-17 Thread Blargy
Blargy - Please try to quote the mail you're responding to, at least the relevant piece. It's nice to see some context to the discussion. No problem ;) Depends - if you optimize the index on the master, then the entire index is replicated. If you simply commit and let Lucene take care of

Re: Peformance tuning

2010-06-17 Thread Otis Gospodnetic
You may want to try the RPM tool, it will show you what inside of that QueryComponent is really slow. http://blog.sematext.com/2010/05/11/solr-performance-monitoring-announcement/ Or you can run Solr under your own profiler. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch

Re: Peformance tuning

2010-06-17 Thread Otis Gospodnetic
Hi, Smaller merge factor will make things worse - it will cause Lucene to merge index segments more often (than the default merge factor of 10), thus resulting in more new files being created on the master, thus resulting in more network IO, more disk IO on the slaves, more OS cache evicted on