Re: Rules engine and Solr

2010-01-06 Thread Avlesh Singh
Thanks for the revert, Ravi. I am currently working on some kind of rules in front (application side) of our Solr instance. These rules are application specific rather than general, like deciding which fields to facet, which fields to return in the response, which fields to highlight, boost

readOnly=true IndexReader

2010-01-06 Thread Patrick Sauts
In the Wiki page : http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, I've found -Open the IndexReader with readOnly=true. This makes a big difference when multiple threads are sharing the same reader, as it removes certain sources of thread contention. How to open the IndexReader with

Re: readOnly=true IndexReader

2010-01-06 Thread Shalin Shekhar Mangar
On Wed, Jan 6, 2010 at 4:26 PM, Patrick Sauts patrick.via...@gmail.com wrote: In the Wiki page : http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, I've found -Open the IndexReader with readOnly=true. This makes a big difference when multiple threads are sharing the same reader, as it

Re: readOnly=true IndexReader

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Jan 6, 2010 at 4:26 PM, Patrick Sauts patrick.via...@gmail.com wrote: In the Wiki page : http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, I've found -Open the IndexReader with readOnly=true. This makes a big difference when multiple threads are sharing the same reader, as it

schema.xml and Xinclude

2010-01-06 Thread Patrick Sauts
As <types/> in schema.xml are the same between all our indexes, I'd like to make them an XInclude so I tried : <?xml version="1.0" encoding="UTF-8"?> <schema name="example" version="1.2" xmlns:xi="http://www.w3.org/2001/XInclude"> <xi:include href="solr-types.xml"/> <fields> - - - </schema> My Syntax might not

Yankee's Solr integration

2010-01-06 Thread Nicolas Kern
Hello everybody, I was wondering how Yankee ( http://www.yankeegroup.com/search.do?searchType=advancedSearch) managed to provide the ability to create alerts, save searches, and generate an RSS feed out of a custom search using Solr. Do you have any idea? Thanks a lot, Best regards happy

Re: Basic sentence parsing with the regex highlighter fragmenter

2010-01-06 Thread Erick Erickson
Hmmm, the name WordDelimiterFilterFactory might be leading you astray. Its purpose isn't to break things up into words that have anything to do with grammatical rules. Rather, its purpose is to break up strings of funky characters into searchable stuff. see:

Re: performance question

2010-01-06 Thread A. Steven Anderson
Strictly speaking there are some insignificant distinctions in performance related to how a field name is resolved -- Grant alluded to this earlier in this thread -- but it only comes into play when you actually refer to that field by name and Solr has to look them up in the metadata. So for

Re: performance question

2010-01-06 Thread Erik Hatcher
You don't lose copyField capability with dynamic fields. You can copy dynamic fields into a fixed field name like *_s = text or dynamic fields into another dynamic field like *_s = *_t Erik On Jan 6, 2010, at 9:35 AM, A. Steven Anderson wrote: Strictly speaking there is some
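The two copy patterns mentioned above (`*_s` into a fixed field, and `*_s` into `*_t`) can be modelled with a short sketch. This is a toy illustration in Python, not Solr code: the function name `copy_target` and the sample field names are hypothetical, and it assumes that the part of the name matched by the source wildcard is substituted into a wildcard destination.

```python
def copy_target(field_name, source_glob, dest):
    """Return the destination field for one copy rule, or None if no match.

    Toy model only: supports a single leading '*' in the source and
    destination patterns, e.g. '*_s' and '*_t'.
    """
    if not source_glob.startswith("*"):
        # Plain source name: must match exactly.
        return dest if field_name == source_glob else None

    suffix = source_glob[1:]
    if not field_name.endswith(suffix):
        return None

    # The part of the field name that the '*' stood for.
    stem = field_name[:len(field_name) - len(suffix)]

    if dest.startswith("*"):
        # Dynamic destination: reuse the matched stem.
        return stem + dest[1:]
    # Fixed destination: every matching field is copied to the same place.
    return dest


print(copy_target("title_s", "*_s", "text"))
print(copy_target("title_s", "*_s", "*_t"))
print(copy_target("price_i", "*_s", "text"))
```

In this model a field named `title_s` is copied to `text` under the first rule and to `title_t` under the second, while a field that does not end in `_s` is left alone.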

ord on TrieDateField always returning max

2010-01-06 Thread Nagelberg, Kallin
Hi everyone, I've been trying to add a date based boost to my queries. I have a field like: <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/> <field name="datetime" type="tdate" indexed="true" stored="true" required="true" /> When I look at the datetime

replication -- missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
I set up replication between 2 cores on one master and 2 cores on one slave. Before doing this the master was working without issues, and I stopped all indexing on the master. Now that replication has synced the index files, an .FDT field is suddenly missing on both the master and the slave.

Re: ord on TrieDateField always returning max

2010-01-06 Thread Yonik Seeley
Besides using up a lot more memory, ord() isn't even going to work for a field with multiple tokens indexed per value (like tdate). I'd recommend using a function on the date value itself. http://wiki.apache.org/solr/FunctionQuery#ms -Yonik http://www.lucidimagination.com On Wed, Jan 6, 2010

RE: ord on TrieDateField always returning max

2010-01-06 Thread Nagelberg, Kallin
Thanks Yonik, I was just looking at that actually. Trying something like recip(ms(NOW,datetime),3.16e-11,1,1)^10 now. My 'inspiration' for the ord method was actually the Solr 1.4 Enterprise Search server book. Page 126 has a section 'using reciprocals and rord with dates'. You should let those
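To get a feel for what `recip(ms(NOW,datetime),3.16e-11,1,1)` produces, the arithmetic can be reproduced outside Solr. The sketch below is plain Python, not Solr code, and assumes the FunctionQuery definition `recip(x,m,a,b) = a / (m*x + b)`; the helper names `recip` and `date_boost` and the sample ages are illustrative only. The constant 3.16e-11 is roughly one over the number of milliseconds in a year, so the value falls to about one half for a one-year-old document.

```python
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000  # roughly 3.156e10 milliseconds


def recip(x, m, a, b):
    """Reciprocal function, assuming the definition a / (m*x + b)."""
    return a / (m * x + b)


def date_boost(age_ms):
    """Value of recip(ms(NOW,datetime),3.16e-11,1,1) for a document of the given age."""
    return recip(age_ms, 3.16e-11, 1, 1)


for years in (0, 1, 5):
    print(years, "year(s) old:", round(date_boost(years * MS_PER_YEAR), 3))
```

The trailing `^10` in the query is a separate boost applied to the function clause and is not modelled here.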

Re: ord on TrieDateField always returning max

2010-01-06 Thread Yonik Seeley
On Wed, Jan 6, 2010 at 11:26 AM, Nagelberg, Kallin knagelb...@globeandmail.com wrote: Thanks Yonik, I was just looking at that actually. Trying something like recip(ms(NOW,datetime),3.16e-11,1,1)^10  now. I'd also recommend looking into a multiplicative boost too - IMO they normally make more

Re: performance question

2010-01-06 Thread A. Steven Anderson
You don't lose copyField capability with dynamic fields. You can copy dynamic fields into a fixed field name like *_s = text or dynamic fields into another dynamic field like *_s = *_t Ahhh...I missed that little detail. Nice! Ok, so there are no negatives to using dynamic fields then.

Re: replication -- missing field data file

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Jan 6, 2010 at 9:49 PM, Giovanni Fernandez-Kincade gfernandez-kinc...@capitaliq.com wrote: I set up replication between 2 cores on one master and 2 cores on one slave. Before doing this the master was working without issues, and I stopped all indexing on the master. Now that

solr and patch - SOLR-64 SOLR-792

2010-01-06 Thread Thibaut Lassalle
hi, I tried to apply patches to solr-1.4 Here is the result javad...@javadev5:~/Java/apache-solr-1.4.0$ patch -p0 SOLR-64.patch patching file src/java/org/apache/solr/schema/HierarchicalFacetField.java patching file src/common/org/apache/solr/common/params/FacetParams.java Hunk #1 FAILED at

Re: solr and patch - SOLR-64 SOLR-792

2010-01-06 Thread Erik Hatcher
You probably aren't doing anything wrong, other than those patches are a bit out of date with trunk. You might have to fight through getting them current a bit, or wait until I or someone else can get to updating them. Erik On Jan 6, 2010, at 11:52 AM, Thibaut Lassalle wrote:

RE: replication -- missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
How can you differentiate between the backup and the normal index files? -Original Message- From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul നോബിള്‍ नोब्ळ् Sent: Wednesday, January 06, 2010 11:52 AM To: solr-user Subject: Re: replication -- missing field

Re: replication -- missing field data file

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
The index dir is named index; the others will be stored as index<date-as-number> On Wed, Jan 6, 2010 at 10:31 PM, Giovanni Fernandez-Kincade gfernandez-kinc...@capitaliq.com wrote: How can you differentiate between the backup and the normal index files? -Original Message- From:

Re: Solr Cell - PDFs plus literal metadata - GET or POST ?

2010-01-06 Thread Ross
On Tue, Jan 5, 2010 at 2:25 PM, Giovanni Fernandez-Kincade gfernandez-kinc...@capitaliq.com wrote: Really? Doesn't it have to be delimited differently, if both the file contents and the document metadata will be part of the POST data? How does Solr Cell tell the difference between the

Strange Behavior When Using CSVRequestHandler

2010-01-06 Thread danben
The problem: Not all of the documents that I expect to be indexed are showing up in the index. The background: I start off with an empty index based on a schema with a single field named 'query', marked as unique and using the following analyzer: analyzer type=index tokenizer

Re: Solr 1.4 - stats page slow

2010-01-06 Thread Stephen Weiss
Sorry, know I'm a little late in replying but the LukeRequestHandler tip was just what I needed! Thank you so much. -- Steve On Dec 25, 2009, at 2:03 AM, Chris Hostetter wrote: : I've noticed this as well, usually when working with a large field cache. I : haven't done in-depth analysis

How to ignore term frequency 1? Field-specific Similarity class?

2010-01-06 Thread Andreas Schwarz
Hi, I want to modify scoring to ignore term frequency 1. This is useful for short fields like titles or subjects, where the number of times a term appears does not correspond to relevancy. I found several discussions of this problem, and also an implementation that changes the Similarity

Re: Basic sentence parsing with the regex highlighter fragmenter

2010-01-06 Thread Caleb Land
I've looked at the docs/source for WordDelimiterFilter, and I understand what it does now. Here is my configuration: http://gist.github.com/270590 I've tried the StandardTokenizerFactory instead of the WhitespaceTokenizerFactory, but I get the same problem as before, a the period from the

No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
I have tested a lot and all the time I thought I set wrong options for my custom analyzer. Well, I have noticed that Solr isn't using ANY analyzer, filter or stemmer. It seems like it only stores the original input. I am using the example-configuration of the current Solr 1.4 release. What's

RE: replication -- missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
How can you tell when the backup is done? -Original Message- From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul നോബിള്‍ नोब्ळ् Sent: Wednesday, January 06, 2010 12:23 PM To: solr-user Subject: Re: replication -- missing field data file the index dir is in

Re: Strange Behavior When Using CSVRequestHandler

2010-01-06 Thread Erick Erickson
I think the root of your problem is that unique fields should NOT be multivalued. See http://wiki.apache.org/solr/FieldOptionsByUseCase?highlight=(unique)|(key) In this case, since you're tokenizing, your query field is

Re: Basic sentence parsing with the regex highlighter fragmenter

2010-01-06 Thread Erick Erickson
Hmmm, I'll have to defer to the highlighter experts here Erick On Wed, Jan 6, 2010 at 3:23 PM, Caleb Land redhatd...@gmail.com wrote: I've looked at the docs/source for WordDelimiterFilter, and I understand what it does now. Here is my configuration: http://gist.github.com/270590

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread Erick Erickson
Well, I have noticed that Solr isn't using ANY analyzer How do you know this? Because it's highly unlikely that SOLR is completely broken on that level. Erick On Wed, Jan 6, 2010 at 3:48 PM, MitchK mitc...@web.de wrote: I have tested a lot and all the time I thought I set wrong options

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread Ryan McKinley
On Jan 6, 2010, at 3:48 PM, MitchK wrote: I have tested a lot and all the time I thought I set wrong options for my custom analyzer. Well, I have noticed that Solr isn't using ANY analyzer, filter or stemmer. It seems like it only stores the original input. The stored value is always

How to set User.dir or CWD for Solr during Tomcat startup

2010-01-06 Thread Turner, Robbin J
Is there any way to force the cwd that Solr starts up in when using the standard startup scripts for Tomcat? I'm working on Solaris, and using SMF to start and stop Tomcat sets the path to /root. I've been doing a bunch of googling and haven't seen if there is a parameter to set within

Search query log using solr

2010-01-06 Thread Ravi Gidwani
Hi All: I am currently using solr 1.4 as the search engine for my application. I am planning to add a search query log that will capture all the search queries (and more information like IP,user info,date time,etc). I understand I can easily do this on the application side capturing all

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Yonik Seeley
On Wed, Jan 6, 2010 at 2:43 AM, Andy angelf...@yahoo.com wrote: I'd like to boost every query using {!boost b=log(popularity)}. But I'd rather not have to prepend that to every query. It'd be much cleaner for me to configure Solr to use that as default. My plan is to make

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Andy
So if I want to configure Solr to turn every query q=foo into q={!boost b=log(popularity)}foo, dismax wouldn't work but edismax would? If that's the case, can you tell me how to set up/use edismax? I can't find much documentation on it. Is it recommended for production use? --- On Wed,
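For reference, the per-query rewrite described here, turning `q=foo` into `q={!boost b=log(popularity)}foo`, looks roughly like this when done on the client side. It is a minimal Python sketch: the helper name `boosted_query_params` and the `/select` path are assumptions for illustration, and it only builds the request string without contacting a server.

```python
from urllib.parse import urlencode, parse_qs


def boosted_query_params(user_query, boost_function="log(popularity)"):
    """Wrap a user query in the {!boost ...} local-params prefix."""
    return {"q": "{!boost b=%s}%s" % (boost_function, user_query)}


# urlencode percent-escapes the braces, '!' and space so the value survives in a URL.
query_string = urlencode(boosted_query_params("foo"))
print("/select?" + query_string)
```

Doing this in the client is exactly the step a server-side default would make unnecessary.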

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Yonik Seeley
On Wed, Jan 6, 2010 at 7:43 PM, Andy angelf...@yahoo.com wrote: So if I want to configure Solr to turn every query q=foo into q={!boost b=log(popularity)}foo, dismax wouldn't work but edismax would? You can do it with dismax it's just that the syntax is slightly more convoluted. Check out

Re: SOLR or Hibernate Search?

2010-01-06 Thread Márcio Paulino
Hi! Thanks for the answers. These were crucial to my decision. I've adapted the solr in my application. On Wed, Dec 30, 2009 at 2:00 AM, Ryan McKinley ryan...@gmail.com wrote: If you need to search via the Hibernate API, then use hibernate search. If you need a scaleable HTTP (REST) then

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Andy
I meant can I do it with dismax without modifying every single query? I'm accessing Solr through haystack and all queries are generated by haystack. I'd much rather not have to go under haystack to modify the generated queries.  Hence I'm trying to find a way to boost every query by default.

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Yonik Seeley
On Wed, Jan 6, 2010 at 8:24 PM, Andy angelf...@yahoo.com wrote: I meant can I do it with dismax without modifying every single query? I'm accessing Solr through haystack and all queries are generated by haystack. I'd much rather not have to go under haystack to modify the generated queries. 

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Andy
Let me make sure I understand you. I'd get my regular query from haystack as qq=foo rather than q=foo. Then I put in solrconfig within the dismax section: <str name="q.alt"> {!boost b=$popularityboost v=$qq}popularityboost=log(popularity) </str> Is that what you meant? --- On Wed,

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
Hello Erick, thank you for answering. I can do whatever I want - Solr does nothing. For example: If I use the textgen-fieldtype which is predefined, nothing happens to the text. Even the stopFilter is not working - no stopword from stopword.txt was replaced. I think, that this only affects the

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
Hello Ryan, thank you for answering. In my schema.xml I am defining the field as indexed = true. The problem is: nothing works, not even the original predefined analyzers. Please have a look at my response to Erick. Mitch P.S. Oh, I see what you mean. The field is indexed = true. My

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread Erik Hatcher
Mitch, Again, I think you're misunderstanding what analysis does. We think you must be expecting, though you haven't provided exact reproduction steps to be sure, that the value you get back from Solr is the analyzer-processed output. It's not; it's exactly what you provide. Internally
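The distinction between the stored value and the analyzed terms can be shown with a toy model. The sketch below is plain Python and does not use Solr's analyzers: the stopword list, the `analyze` chain (whitespace split, lowercase, stopword removal) and the function names are all hypothetical, chosen only to show that the stored text comes back unchanged while the analyzed terms are what matching runs against.

```python
STOPWORDS = {"the", "a", "of"}  # hypothetical stopword list


def analyze(text):
    """Toy analysis chain: split on whitespace, lowercase, drop stopwords."""
    return [token for token in text.lower().split() if token not in STOPWORDS]


def index_document(text):
    """Keep the original text verbatim and the analyzed terms separately."""
    return {"stored": text, "indexed_terms": analyze(text)}


def matches(doc, query):
    """A query matches if all of its analyzed terms are in the indexed terms."""
    return all(term in doc["indexed_terms"] for term in analyze(query))


doc = index_document("The Art of Search")
print(doc["stored"])          # what a search response would return
print(doc["indexed_terms"])   # what searching actually runs against
print(matches(doc, "ART"))
```

In this model, checking whether analysis is working means inspecting the indexed terms or running a query that only matches after analysis, not looking at the stored value in the response.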