Re: positionIncrementGap in schema.xml

2008-10-06 Thread sanraj25
Hi, Thanks Erik. I am clear. But when I checked with multiValued=true for a single field, I gave positionIncrementGap=100. That time there was also a mismatch. For example, with author: John Doe and author: Bob Smith, a phrase query of doe bob still matched even though I specified positionIncrementGap=100. Again I changed

Re: How to tokenize/analyze docs for the spellchecker - at indexing and query time

2008-10-06 Thread Martin Grotzke
Hi Jason, what about multi-word searches like harry potter? When I do a search in our index for harry poter, I get the suggestion harry spotter (using spellcheck.collate=true and jarowinkler distance). Searching for harry spotter (we're searching AND, not OR) then gives no results. I assume that

Analysers and Tokenizers

2008-10-06 Thread sanraj25
Hi, In schema.xml we use the analyzer tag inside the fieldtype tag. In the analyzer tag we sometimes specify type=index and type=query, e.g. analyzer type=index. Sometimes we do not specify any type, e.g. analyzer. In what situation do we use the analyzer tag, and in what situation do we give

TrimFilterFactory

2008-10-06 Thread sanraj25
Hi, I want to know with which tokenizer TrimFilterFactory is used. I tried it with every tokenizer, but the white space at the front and back was not removed, because the tokenizer splits the query first and only then is the filter applied. So please tell me when to use TrimFilterFactory. Thanks -sanraj

Re: Availability Issues

2008-10-06 Thread sunnyfr
Hi Matthew, What do you mean by post your updates? Does that mean that you just scp or copy the data directory with a cron job, without using automatic replication? Because really, since I turned on the autoCommit snapshooter, it slows down and messes everything up a bit. Did you have the same

required keyword in all a document

2008-10-06 Thread KLessou
Hi, I would like to find all documents that contain France, Flag, French. I've got docs like this one: <doc> <str name="id">...</str> <str name="k1_en">wordA,wordB,france, ...</str> <str name="k2_en">wordA,wordB,flag, ...</str> <str name="k3_en">wordA,wordB,french, ...</str> ... </doc> I can't make my

Re: Indexing Large Files with Large DataImport: Problems

2008-10-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
Did you get a chance to test with the patch? did it work? On Wed, Oct 1, 2008 at 10:13 AM, Noble Paul നോബിള്‍ नोब्ळ् [EMAIL PROTECTED] wrote: this patch is created from 1.3 (may apply on trunk also) --Noble On Wed, Oct 1, 2008 at 9:56 AM, Noble Paul നോബിള്‍ नोब्ळ् [EMAIL PROTECTED] wrote: I

Re: positionIncrementGap in schema.xml

2008-10-06 Thread Erik Hatcher
Sanraj - did you reindex after adjusting the value of positionIncrementGap? It is an index-time factor. Erik On Oct 6, 2008, at 2:12 AM, sanraj25 wrote: Hi, Thanks Erik. I am clear. But when I checked with multiValued=true for a single field, I gave
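
The effect of the gap can be sketched in a few lines of Python. This is a simplified model of token positions, not Lucene's implementation: the function names are made up for illustration, and the exact position arithmetic is an assumption. It shows why doe bob matches with no gap and stops matching with a gap of 100, and, because positions are assigned when a document is indexed, why documents indexed before the change keep their old positions until they are reindexed.

```python
def index_positions(values, gap):
    """Assign a position to every token of a multi-valued field.

    Simplified model: each value after the first starts `gap`
    positions beyond where it would otherwise have started.
    """
    positions = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap
        for token in value.lower().split():
            pos += 1
            positions.append((token, pos))
    return positions


def phrase_matches(positions, phrase):
    """Return True if the words of `phrase` occur at consecutive positions."""
    words = phrase.lower().split()
    index = {}
    for token, pos in positions:
        index.setdefault(token, set()).add(pos)
    return any(
        all(start + offset in index.get(word, set())
            for offset, word in enumerate(words))
        for start in index.get(words[0], set())
    )


authors = ["John Doe", "Bob Smith"]
no_gap = index_positions(authors, gap=0)
with_gap = index_positions(authors, gap=100)
print(phrase_matches(no_gap, "doe bob"))
print(phrase_matches(with_gap, "doe bob"))
```

A phrase inside a single value, such as john doe, still matches in both cases; only phrases that straddle two values are affected.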

Newbie: Example returns XML problem

2008-10-06 Thread Ward, Martin
Hi all, I am just starting to venture into the realms of Solr and Lucene as I have a specific requirement that I think it will meet so I downloaded the software and ran the example tutorial: http://lucene.apache.org/solr/tutorial.html Adding the example .XML files worked fine but when I go into

Solr and log files

2008-10-06 Thread Ward, Martin
Hi again, Further to my last missive, I am looking at Solr as a quick way of searching log files previously generated by Postfix. I have tried using the basic Lucene command line interface to add in the log files and try and retrieve data from them, but I can't retrieve the data I need. I guess

Re: Newbie: Example returns XML problem

2008-10-06 Thread Erik Hatcher
Solr returns XML by default, not HTML. That's why. At this point Solr out of the box is an engine - still requires building a user interface around it to be useful to end users. SOLR-620 is a start at changing this :) Erik On Oct 6, 2008, at 8:14 AM, Ward, Martin wrote: Hi

Re: Analysers and Tokenizers

2008-10-06 Thread Erik Hatcher
An analyzer with no type specified is used at both indexing and query parsing time. Sometimes it is desirable to do different tokenization depending on when it is occurring (such as synonym injection). Erik On Oct 6, 2008, at 4:32 AM, sanraj25 wrote: Hi, In schema.xml we
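
As a rough illustration of the synonym case, here is a toy Python model of two analysis chains. It is not Solr code: the function names and the synonym table are hypothetical. The index-time chain injects synonyms and the query-time chain does not, which is the kind of difference that separate type=index and type=query analyzers allow.

```python
# Hypothetical synonym table, for illustration only.
SYNONYMS = {"tv": ["television"]}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def index_analyzer(text):
    """Index-time chain: tokenize, then inject synonyms after each token."""
    tokens = []
    for token in tokenize(text):
        tokens.append(token)
        tokens.extend(SYNONYMS.get(token, []))
    return tokens


def query_analyzer(text):
    """Query-time chain: tokenize only, with no synonym injection."""
    return tokenize(text)


indexed = index_analyzer("Buy TV")
query = query_analyzer("Television")
print(indexed, query, set(query) <= set(indexed))
```

Because the synonym was added when the document was indexed, a query for television can find a document that only says TV without expanding the query itself.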

A command is still running..

2008-10-06 Thread sunnyfr
Hi, A command is still running... How can I know exactly what it is doing? Because really, it looks like it does something, but I have a cron job for delta-import every five minutes and nothing changes. For the last 20 minutes, I would say, nothing has changed at all, neither rows fetched nor requests made. And I

RE: Newbie: Example returns XML problem

2008-10-06 Thread Ward, Martin
Ah, that would explain it then! Thanks for the quick answer, I can dig out the results from the XML output for now until I find an XML browser or until Solr gets a UI! |\/|artin From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 06 October 2008 13:17 To: solr-user@lucene.apache.org

Re: required keyword in all a document

2008-10-06 Thread KLessou
MultiFieldQueryParser seems to generate what I want: http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/queryParser/MultiFieldQueryParser.html But is there a PHP version? On Mon, Oct 6, 2008 at 1:24 PM, KLessou [EMAIL PROTECTED] wrote: Hi, I would like to

Re: How to tokenize/analyze docs for the spellchecker - at indexing and query time

2008-10-06 Thread Grant Ingersoll
On Oct 6, 2008, at 3:51 AM, Martin Grotzke wrote: Hi Jason, what about multi-word searches like harry potter? When I do a search in our index for harry poter, I get the suggestion harry spotter (using spellcheck.collate=true and jarowinkler distance). Searching for harry spotter (we're

Re: TrimFilterFactory

2008-10-06 Thread Grant Ingersoll
I'm not exactly sure what you are asking so a little clarification would help, but you might try using the Analysis admin page (I believe it's http://localhost:8983/solr/admin/analysis.jsp in the example) to see how your tokenizer, etc. is working for the various fields. On Oct 6, 2008, at

Re: Newbie: Example returns XML problem

2008-10-06 Thread Erik Hatcher
On Oct 6, 2008, at 8:22 AM, Ward, Martin wrote: Thanks for the quick answer, I can dig out the results from the XML output for now until I find an XML browser or until Solr gets a UI! Do note there are numerous Solr APIs out there now, for practically any language you want. Solr can

Re: Solr and log files

2008-10-06 Thread Grant Ingersoll
This really would be up to you to write the parser for your log files that then generates Solr documents. Another option, possibly, is to make them CSV files; then you could just upload them. On Oct 6, 2008, at 8:14 AM, Ward, Martin wrote: Hi again, Further to my last missive, I
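
A minimal sketch of the CSV route, in Python. Everything specific here is an assumption: the log line layout follows the common syslog style, the sample line is invented, and the column names (id, timestamp, host, process, pid, message) are hypothetical and would have to match fields defined in your own schema.xml.

```python
import csv
import io
import re

# Assumed syslog-style layout: "Oct  6 08:14:02 host postfix/smtpd[2412]: message"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>postfix/[\w-]+)\[(?P<pid>\d+)\]:\s"
    r"(?P<message>.*)$"
)

# Hypothetical column names; they must exist as fields in the Solr schema.
FIELDS = ["id", "timestamp", "host", "process", "pid", "message"]


def logs_to_csv(lines):
    """Convert Postfix log lines to CSV text, skipping lines that do not parse."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for number, line in enumerate(lines, start=1):
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue
        row = match.groupdict()
        row["id"] = str(number)  # the line number serves as a simple unique key
        writer.writerow(row)
    return buffer.getvalue()


sample = [
    "Oct  6 08:14:02 mailhost postfix/smtpd[2412]: 4F1A2B3C4D: client=unknown[192.0.2.10]",
    "this line is not a Postfix entry",
]
print(logs_to_csv(sample))
```

The resulting text can be saved to a file and uploaded as CSV; lines that do not fit the assumed layout are dropped silently here, which you may prefer to log instead.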

command is still running ? delta-import?

2008-10-06 Thread sunnyfr
Hi, A command is still running... How can I know exactly what it is doing? Because really, it looks like it does something, but I have a cron job for delta-import every five minutes and nothing changes. For the last 20 minutes, I would say, nothing has changed at all, neither rows fetched nor requests made. And I

command still running ??? delta-import

2008-10-06 Thread sunnyfr
Hi, A command is still running... How can I know exactly what it is doing? Because really, it looks like it does something, but I have a cron job for delta-import every five minutes and nothing changes. For the last 20 minutes, I would say, nothing has changed at all, neither rows fetched nor requests made. And I

Re: required keyword in all a document

2008-10-06 Thread KLessou
On Mon, Oct 6, 2008 at 1:24 PM, KLessou [EMAIL PROTECTED] wrote: Hi, I would like to find all documents that contain France, Flag, French. I've got docs like this one: <doc> <str name="id">...</str> <str name="k1_en">wordA,wordB,france, ...</str> <str name="k2_en">wordA,wordB,flag, ...</str>

chaining copyFields

2008-10-06 Thread Batzenmann
Hi, I recently discovered that the copyFields directive exclusively works for copying fields from the input doc to the indexed/stored doc. I'd like to be able to do the following: excerpt from schema.xml: <field name="a" type="grandVoodoo" indexed="true" stored="true" multiValue="true"/> <field name="b"

Re: Transitioning from Solr 1.2 to Solr 1.3

2008-10-06 Thread Jason Rennie
We just went through this process. We simply copied the 1.2 index to the new server (while the 1.2 server was live---responding to requests and handling updates) and started it up with 1.3. It worked. I can't promise that you'll have the same experience, but it's worth a try. Also, if I were

Re: How to tokenize/analyze docs for the spellchecker - at indexing and query time

2008-10-06 Thread Walter Underwood
This is why OR is a better choice. With AND, one miss means no results at all. Spelling suggestions will never be good enough to make AND work. wunder On 10/6/08 12:51 AM, Martin Grotzke [EMAIL PROTECTED] wrote: Hi Jason, what about multi-word searches like harry potter? When I do a search

Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-06 Thread Uwe Klosa
I already had the chance to set up a new server for testing. Before deploying my application I checked my solrconfig against the solrconfig from 1.3 and removed the deprecated parameters. I started updating the new index. I ingest 100 documents at a time and then I do a commit(). With 2000

Re: command is still running ? delta-import?

2008-10-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
Maybe you can do a thread dump of Solr. It may throw some light on what it is up to. kill -3 pid in *nix --Noble On Mon, Oct 6, 2008 at 6:25 PM, sunnyfr [EMAIL PROTECTED] wrote: Hi, A command is still running... How can I know exactly what it is doing? Because really, it looks like it does
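
For readers unfamiliar with the suggestion: kill -3 sends signal number 3 (SIGQUIT) to a process, and a Sun/HotSpot JVM responds by printing the stack of every thread to its standard output while it keeps running. The Python sketch below does the same thing as the shell command; the function name is made up for illustration, it only works on Unix-like systems, and the pid must be that of the Java process running Solr.

```python
import os
import signal


def request_thread_dump(pid):
    """Send SIGQUIT (signal 3) to a process, like `kill -3 pid` in a shell.

    A HotSpot JVM prints a thread dump to its stdout and continues;
    most other programs terminate on this signal, so pick the pid with care.
    """
    os.kill(pid, signal.SIGQUIT)


print(int(signal.SIGQUIT))
```

The dump appears wherever the servlet container's standard output goes (for example the console or a log file), not in the terminal that sent the signal.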

Re: Calculated Unique Key Field

2008-10-06 Thread Jim Murphy
Thanks, Shalin. Shalin Shekhar Mangar wrote: On Wed, Oct 1, 2008 at 12:08 AM, Jim Murphy [EMAIL PROTECTED] wrote: Question1: Is this the best place to do this? This sounds like a job for http://wiki.apache.org/solr/UpdateRequestProcessor -- Regards, Shalin Shekhar Mangar.

Re: chaining copyFields

2008-10-06 Thread Chris Hostetter
: I recently discovered that the copyFields directive exclusively works for : copying fields from the input doc to the indexed/stored doc. correct. you have to explicitly list the mappings, it's not recursive (as i recall this was done partially because it was easier, but also because it
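
The non-recursive behaviour described here can be modelled in a few lines of Python. This is a toy model, not Solr's code: the function name and the data layout (a dict of field name to list of values, plus a list of source/destination pairs) are assumptions made for illustration.

```python
def apply_copy_fields(doc, copy_fields):
    """Apply copy rules to an input document without chaining.

    Each rule reads only from the fields of the original input `doc`,
    so a value that arrives in a field by copying is never copied onward.
    """
    result = {name: list(values) for name, values in doc.items()}
    for source, dest in copy_fields:
        if source in doc:  # deliberately checks the input doc, not `result`
            result.setdefault(dest, []).extend(doc[source])
    return result


doc = {"a": ["x"]}
chained = apply_copy_fields(doc, [("a", "b"), ("b", "c")])
explicit = apply_copy_fields(doc, [("a", "b"), ("a", "c")])
print(chained)
print(explicit)
```

With the rules a to b and b to c, field c stays empty because b was not part of the input; listing a to c explicitly is what gets the value there.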

Re: command is still running ? delta-import?

2008-10-06 Thread sunnyfr
What does that mean: kill -3 pid? And is this one a command too: in *nix? Thanks for your advice. Noble Paul നോബിള്‍ नोब्ळ् wrote: Maybe you can do a thread dump of Solr. It may throw some light on what it is up to. kill -3 pid in *nix --Noble On Mon, Oct 6, 2008 at 6:25

Index updates blocking readers: To Multicore or not?

2008-10-06 Thread Jim Murphy
We have a farm of several Master-Slave pairs all managing a single very large logical index sharded across the master-slaves. We notice on the slaves, after an rsync update, as the index is being committed that all queries are blocked sometimes resulting in unacceptable service times. I'm

Re: dismax and long phrases

2008-10-06 Thread Jon Drukman
Chris Hostetter wrote: It's not a bug in the implementation, it's a side effect of the basic tenet of how dismax works: since it inverts the input and creates a DisjunctionMaxQuery for each word in the input, any word that is valid in at least one of the qf fields generates a should clause

Stored field question

2008-10-06 Thread Jake Conk
Hello, I have a field with the following definition... <dynamicField name="*_t_ns_mv" type="text" indexed="true" stored="false" multiValued="true"/> I'm not storing the data because I never need to retrieve it but each *_t_ns_mv field is indexed and has a specific boost value... I added this field with the

Re: required keyword in all a document

2008-10-06 Thread Jason Rennie
Sounds like the DisMax handler would work well for you. I'm no expert, but I'm fairly certain that if you created a solr.DisMaxRequestHandler handler with qf containing those three fields, you could issue the query +france +flag +french and get the desired results. Jason On Mon, Oct 6, 2008 at
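
A small Python sketch of what such a request could look like. The field names k1_en, k2_en and k3_en come from the original question; the base URL is the one used by the Solr example setup, and selecting the handler with qt=dismax assumes a handler registered under the name dismax, as in the example solrconfig.xml. Treat both as assumptions for your own installation.

```python
from urllib.parse import urlencode


def dismax_url(base_url, terms, fields):
    """Build a select URL that requires every term, searched across `fields`."""
    params = {
        "qt": "dismax",                          # assumed handler name
        "q": " ".join("+" + term for term in terms),
        "qf": " ".join(fields),
    }
    return base_url + "?" + urlencode(params)


url = dismax_url(
    "http://localhost:8983/solr/select",
    ["france", "flag", "french"],
    ["k1_en", "k2_en", "k3_en"],
)
print(url)
```

Each required term then has to match in at least one of the listed fields, which is the behaviour asked for; urlencode takes care of escaping the plus signs and spaces.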

spellcheck: issues

2008-10-06 Thread Jason Rennie
I've noticed a few issues with spellcheck as I've been testing it out for use on our site... 1. Rebuild breaks requests - I'm using rebuildOnCommit ATM. If a commit is going on and files are being rebuilt in the spellcheck data dir, spellcheck requests yield bogus answers. I.e. I can

Re: How to tokenize/analyze docs for the spellchecker - at indexing and query time

2008-10-06 Thread Martin Grotzke
On Mon, 2008-10-06 at 09:00 -0400, Grant Ingersoll wrote: On Oct 6, 2008, at 3:51 AM, Martin Grotzke wrote: Hi Jason, what about multi-word searches like harry potter? When I do a search in our index for harry poter, I get the suggestion harry spotter (using spellcheck.collate=true

Re: chaining copyFields

2008-10-06 Thread Axel Tetzlaff
hossman wrote: it ensures the schema creator must clearly think through exactly what should be copied where -- otherwise you might have existing mappings of x-z and y-z that you don't consider/remember when adding a-x (or worse: a-x and a-y) Hmm, I can see your reasoning here.

Re: spellcheck: issues

2008-10-06 Thread Jason Rennie
I've been using spellcheck.count=10 since that seems to yield a much better top result than using the default count of 1. However, I'm still seeing weird cases. Here are a few queries with returned suggestions. Frequency counts are in parentheses. - query is candyz. Suggestions are: 1.

Re: dismax and long phrases

2008-10-06 Thread Mike Klaas
On 6-Oct-08, at 11:20 AM, Jon Drukman wrote: Chris Hostetter wrote: It's not a bug in the implementation, it's a side effect of the basic tenet of how dismax works: since it inverts the input and creates a DisjunctionMaxQuery for each word in the input, any word that is valid in at least

Re: Stored field question

2008-10-06 Thread Yonik Seeley
On Mon, Oct 6, 2008 at 2:37 PM, Jake Conk [EMAIL PROTECTED] wrote: I have a field with the following definition... <dynamicField name="*_t_ns_mv" type="text" indexed="true" stored="false" multiValued="true"/> I'm not storing the data because I never need to retrieve it but each *_t_ns_mv field is

Re: Discarding undefined fields in query

2008-10-06 Thread Chris Hostetter
: req.getSchema().getQueryAnalyzer(); : : I think it's in this analyzer that the undefined field error happens : (because for instance the field 'foo' doesn't exist in the schema, : and so it's impossible to find a specific analyzer for this field in : the schema). Correct. : The strange thing

Re: required keyword in all a document

2008-10-06 Thread Chris Hostetter
: Sounds like the DisMax handler would work well for you. I'm no expert, but : I'm fairly certain that if you created a solr.DisMaxRequestHandler handler : with qf containing those three fields, you could issue the query +france : +flag +french and get the desired results. correct. -Hoss

Re: solr filesystem dependencies

2008-10-06 Thread Ryan McKinley
On Oct 6, 2008, at 5:58 PM, Chris Hostetter wrote: : The only filesystem dependency that I want is the index itself. should we assume you're baking your solrconfig.xml and schema.xml directly into a jar? : The current implementation of the SolrResource seems to suggest that i need : a

Re: command is still running ? delta-import?

2008-10-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
*nix stands for the different flavours of Unix. Sorry, it is not a command. I assumed that you are using a Linux/Unix system. If you are using Windows, press Pause/Break in that window. On Mon, Oct 6, 2008 at 11:38 PM, sunnyfr [EMAIL PROTECTED] wrote: What does that mean: kill -3 pid? And this

Re: solr filesystem dependencies

2008-10-06 Thread Chris Hostetter
: To instantiate the schema and core, you should not *need* any files on disk -- : however, many of the plugins expect files like 'stopwords.txt' 'elevate.xml' : etc. They use the ResourceLoader, so (in theory) you could hijack that to : send stuff from your .jar you shouldn't need to hijack