Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread Erik Hatcher
Mitch, Again, I think you're misunderstanding what analysis does. You must be expecting we think, though you've not provided exact duplication steps to be sure, that the value you get back from Solr is the analyzer processed output. It's not, it's exactly what you provide. Internally f
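
The distinction drawn above, between the stored value returned in results and the analyzed tokens used for matching, can be sketched with a toy model. This is only an illustration under stated assumptions: the analyze function, the stopword list, and the dictionary layout are hypothetical and are not Solr's implementation.

```python
# Toy model of a field that is both stored and indexed.
# The analyzer chain here (lowercase + stopword removal) is a stand-in,
# not Solr's actual implementation.

STOPWORDS = {"the", "a", "of"}  # hypothetical stopword list


def analyze(text):
    """Return the tokens that would go into the inverted index."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOPWORDS]


def add_document(value):
    """Keep the raw value for retrieval and the analyzed tokens for search."""
    return {"stored": value, "indexed": analyze(value)}


doc = add_document("The Power of Solr")
print(doc["stored"])   # the original text, returned unchanged in results
print(doc["indexed"])  # the analyzed tokens, used only for matching
```

In this model the stored value never changes, so looking at returned field values says nothing about whether the analyzer ran; the indexed terms are what to inspect.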

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
Hello Ryan, thank you for answering. In my schema.xml I am defining the field as "indexed = true". The problem is that nothing works, not even the original predefined analyzers. Please have a look at my response to Erick. Mitch P.S. Oh, I see what you mean. The field is indexed = true. My

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
Hello Erick, thank you for answering. I can do whatever I want - Solr does nothing. For example: If I use the textgen-fieldtype which is predefined, nothing happens to the text. Even the stopFilter is not working - no stopword from stopword.txt was replaced. I think, that this only affects the i

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Andy
Let me make sure I understand you. I'd get my regular query from haystack as qq=foo rather than q=foo. Then I put in solrconfig within the dismax section:       {!boost b=$popularityboost v=$qq}&popularityboost=log(popularity) Is that what you meant? --- On Wed, 1/6/10, Yonik Seeley
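
A minimal sketch of the parameter set being discussed, assuming the client can send the user's text as qq instead of q. The parameter names qq and popularityboost come from the thread; the boosted_params helper is hypothetical, and in the setup discussed here q and popularityboost would live in the handler defaults in solrconfig rather than being sent with every request.

```python
from urllib.parse import urlencode, parse_qs


def boosted_params(user_query):
    """Build request parameters that wrap the user's query in a boost.

    qq and popularityboost are the parameter names used in the thread;
    the popularity field is assumed to exist in the schema.
    """
    return {
        "q": "{!boost b=$popularityboost v=$qq}",
        "popularityboost": "log(popularity)",
        "qq": user_query,
    }


# URL-encode the parameters as they would appear in a request.
query_string = urlencode(boosted_params("foo"))
print(query_string)
```

Only qq varies per request here, which is why moving the other two parameters into the handler configuration would leave the generated queries untouched apart from the parameter name.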

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Yonik Seeley
On Wed, Jan 6, 2010 at 8:24 PM, Andy wrote: > I meant can I do it with dismax without modifying every single query? I'm > accessing Solr through haystack and all queries are generated by haystack. > I'd much rather not have to go under haystack to modify the generated > queries.  Hence I'm tryi

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Andy
I meant can I do it with dismax without modifying every single query? I'm accessing Solr through haystack and all queries are generated by haystack. I'd much rather not have to go under haystack to modify the generated queries.  Hence I'm trying to find a way to boost every query by default. --

Re: SOLR or Hibernate Search?

2010-01-06 Thread Márcio Paulino
Hi! Thanks for the answers. These were crucial to my decision. I've adopted Solr in my application. On Wed, Dec 30, 2009 at 2:00 AM, Ryan McKinley wrote: > If you need to search via the Hibernate API, then use hibernate search. > > If you need a scalable HTTP (REST) then solr may be the wa

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Yonik Seeley
On Wed, Jan 6, 2010 at 7:43 PM, Andy wrote: > So if I want to configure Solr to turn every query q=foo into q={!boost > b=log(popularity)}foo, dismax wouldn't work but edismax would? You can do it with dismax it's just that the syntax is slightly more convoluted. Check out the section on boo

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Andy
So if I want to configure Solr to turn every query q=foo into q={!boost b=log(popularity)}foo, dismax wouldn't work but edismax would? If that's the case, can you tell me how to set up/use edismax? I can't find much documentation on it. Is it recommended for production use? --- On Wed, 1/6/10,

Re: DisMaxRequestHandler bf configuration

2010-01-06 Thread Yonik Seeley
On Wed, Jan 6, 2010 at 2:43 AM, Andy wrote: > I'd like to boost every query using {!boost b=log(popularity)}. But I'd > rather not have to prepend that to every query. It'd be much cleaner for me > to configure Solr to use that as default. > > My plan is to make DisMaxRequestHandler the default

Search query log using solr

2010-01-06 Thread Ravi Gidwani
Hi All: I am currently using solr 1.4 as the search engine for my application. I am planning to add a search query log that will capture all the search queries (and more information like IP,user info,date time,etc). I understand I can easily do this on the application side capturing all th

How to set User.dir or CWD for Solr during Tomcat startup

2010-01-06 Thread Turner, Robbin J
Is there any way to force the cwd that Solr starts up in when using the standard startup scripts for Tomcat? I'm working on Solaris, and using the SMF to start and stop Tomcat sets the path to /root. I've been doing a bunch of googling and haven't seen if there is a parameter to set within Tomca

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread Ryan McKinley
On Jan 6, 2010, at 3:48 PM, MitchK wrote: I have tested a lot and all the time I thought I set wrong options for my custom analyzer. Well, I have noticed that Solr isn't using ANY analyzer, filter or stemmer. It seems like it only stores the original input. The stored value is always t

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread Erick Erickson
<<>> How do you know this? Because it's highly unlikely that SOLR is completely broken on that level. Erick On Wed, Jan 6, 2010 at 3:48 PM, MitchK wrote: > > I have tested a lot and all the time I thought I set wrong options for my > custom analyzer. > Well, I have noticed that Solr isn't

Re: Basic sentence parsing with the regex highlighter fragmenter

2010-01-06 Thread Erick Erickson
Hmmm, I'll have to defer to the highlighter experts here Erick On Wed, Jan 6, 2010 at 3:23 PM, Caleb Land wrote: > I've looked at the docs/source for WordDelimiterFilter, and I understand > what it does now. > > Here is my configuration: > > http://gist.github.com/270590 > > I've tried the

Re: Strange Behavior When Using CSVRequestHandler

2010-01-06 Thread Erick Erickson
I think the root of your problem is that unique fields should NOT be multivalued. See http://wiki.apache.org/solr/FieldOptionsByUseCase?highlight=(unique)|(key) In this case, since you're tokenizing, your "query" field is

RE: replication --> missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
How can you tell when the backup is done? -Original Message- From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul നോബിള്‍ नोब्ळ् Sent: Wednesday, January 06, 2010 12:23 PM To: solr-user Subject: Re: replication --> missing field data file the index dir is in

No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
I have tested a lot and all the time I thought I set wrong options for my custom analyzer. Well, I have noticed that Solr isn't using ANY analyzer, filter or stemmer. It seems like it only stores the original input. I am using the example-configuration of the current Solr 1.4 release. What's wron

Re: Basic sentence parsing with the regex highlighter fragmenter

2010-01-06 Thread Caleb Land
I've looked at the docs/source for WordDelimiterFilter, and I understand what it does now. Here is my configuration: http://gist.github.com/270590 I've tried the StandardTokenizerFactory instead of the WhitespaceTokenizerFactory, but I get the same problem as before, the period from the previo

How to ignore term frequency >1? Field-specific Similarity class?

2010-01-06 Thread Andreas Schwarz
Hi, I want to modify scoring to ignore term frequency > 1. This is useful for short fields like titles or subjects, where the number of times a term appears does not correspond to relevancy. I found several discussions of this problem, and also an implementation that changes the Similarity clas
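
The arithmetic behind this request can be sketched as follows. Lucene's DefaultSimilarity computes the tf factor as the square root of the term frequency; the binary_tf function is an assumed replacement that treats any positive frequency as a single occurrence. In Lucene itself this would be done in a Similarity subclass, so the Python below only illustrates the numbers and is not the implementation mentioned in the post.

```python
import math


def default_tf(freq):
    """Lucene's DefaultSimilarity tf: square root of the term frequency."""
    return math.sqrt(freq)


def binary_tf(freq):
    """Ignore repeats: any positive frequency counts as one occurrence."""
    return 1.0 if freq > 0 else 0.0


# Compare the two factors for a term that is absent, present once,
# and present four times in a short field such as a title.
for freq in (0, 1, 4):
    print(freq, default_tf(freq), binary_tf(freq))
```

With the binary variant, a title that repeats a term scores the same on tf as one that mentions it once, which is the behaviour asked for above.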

Re: Solr 1.4 - stats page slow

2010-01-06 Thread Stephen Weiss
Sorry, I know I'm a little late in replying but the LukeRequestHandler tip was just what I needed! Thank you so much. -- Steve On Dec 25, 2009, at 2:03 AM, Chris Hostetter wrote: : I've noticed this as well, usually when working with a large field cache. I : haven't done in-depth analysis

Strange Behavior When Using CSVRequestHandler

2010-01-06 Thread danben
The problem: Not all of the documents that I expect to be indexed are showing up in the index. The background: I start off with an empty index based on a schema with a single field named 'query', marked as unique and using the following analyzer:

Re: Solr Cell - PDFs plus literal metadata - GET or POST ?

2010-01-06 Thread Ross
On Tue, Jan 5, 2010 at 2:25 PM, Giovanni Fernandez-Kincade wrote: > Really? Doesn't it have to be delimited differently, if both the file > contents and the document metadata will be part of the POST data? How does > Solr Cell tell the difference between the literals and the start of the file?

Re: replication --> missing field data file

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
the index dir is in the name "index" others will be stored as index On Wed, Jan 6, 2010 at 10:31 PM, Giovanni Fernandez-Kincade wrote: > How can you differentiate between the backup and the normal index files? > > -Original Message- > From: noble.p...@gmail.com [mailto:noble.p...@gmail.co

RE: replication --> missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
How can you differentiate between the backup and the normal index files? -Original Message- From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul നോബിള്‍ नोब्ळ् Sent: Wednesday, January 06, 2010 11:52 AM To: solr-user Subject: Re: replication --> missing field d

Re: solr and patch - SOLR-64 SOLR-792

2010-01-06 Thread Erik Hatcher
You probably aren't doing anything wrong, other than those patches are a bit out of date with trunk. You might have to fight through getting them current a bit, or wait until I or someone else can get to updating them. Erik On Jan 6, 2010, at 11:52 AM, Thibaut Lassalle wrote: hi

solr and patch - SOLR-64 SOLR-792

2010-01-06 Thread Thibaut Lassalle
hi, I tried to apply patches to solr-1.4 Here is the result javad...@javadev5:~/Java/apache-solr-1.4.0$ patch -p0 < SOLR-64.patch patching file src/java/org/apache/solr/schema/HierarchicalFacetField.java patching file src/common/org/apache/solr/common/params/FacetParams.java Hunk #1 FAILED at 10

Re: replication --> missing field data file

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Jan 6, 2010 at 9:49 PM, Giovanni Fernandez-Kincade wrote: > I set up replication between 2 cores on one master and 2 cores on one slave. > Before doing this the master was working without issues, and I stopped all > indexing on the master. > > Now that replication has synced the index fi

Re: performance question

2010-01-06 Thread A. Steven Anderson
> You don't lose copyField capability with dynamic fields. You can copy > dynamic fields into a fixed field name like *_s => text or dynamic fields > into another dynamic field like *_s => *_t Ahhh...I missed that little detail. Nice! Ok, so there are no negatives to using dynamic fields then

Re: ord on TrieDateField always returning max

2010-01-06 Thread Yonik Seeley
On Wed, Jan 6, 2010 at 11:26 AM, Nagelberg, Kallin wrote: > Thanks Yonik, I was just looking at that actually. > Trying something like recip(ms(NOW,datetime),3.16e-11,1,1)^10  now. I'd also recommend looking into a multiplicative boost too - IMO they normally make more sense. http://wiki.apache.o

RE: ord on TrieDateField always returning max

2010-01-06 Thread Nagelberg, Kallin
Thanks Yonik, I was just looking at that actually. Trying something like recip(ms(NOW,datetime),3.16e-11,1,1)^10 now. My 'inspiration' for the ord method was actually the Solr 1.4 Enterprise Search server book. Page 126 has a section 'using reciprocals and rord with dates'. You should let those
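
As a worked example of the function above: Solr's recip(x,m,a,b) evaluates a / (m*x + b), and 3.16e-11 is roughly one divided by the number of milliseconds in a year, so the value falls from 1 for a brand-new document to about one half at one year old. The helper names and the 365-day year are assumptions for illustration, and the ^10 query boost from the message is left out.

```python
MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000  # 31,536,000,000 ms


def recip(x, m, a, b):
    """Solr's recip function: a / (m * x + b)."""
    return a / (m * x + b)


def date_boost(age_ms):
    """recip(ms(NOW,datetime),3.16e-11,1,1) for a document age_ms old."""
    return recip(age_ms, 3.16e-11, 1, 1)


# Boost value for documents that are new, one year old and ten years old.
for years in (0, 1, 10):
    print(years, round(date_boost(years * MS_PER_YEAR), 3))
```

Unlike ord(), this depends only on the date value of each document, not on how many distinct terms the field has in the index.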

Re: ord on TrieDateField always returning max

2010-01-06 Thread Yonik Seeley
Besides using up a lot more memory, ord() isn't even going to work for a field with multiple tokens indexed per value (like tdate). I'd recommend using a function on the date value itself. http://wiki.apache.org/solr/FunctionQuery#ms -Yonik http://www.lucidimagination.com On Wed, Jan 6, 2010 at

replication --> missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
I set up replication between 2 cores on one master and 2 cores on one slave. Before doing this the master was working without issues, and I stopped all indexing on the master. Now that replication has synced the index files, an .FDT file is suddenly missing on both the master and the slave. Pr

ord on TrieDateField always returning max

2010-01-06 Thread Nagelberg, Kallin
Hi everyone, I've been trying to add a date based boost to my queries. I have a field like: When I look at the datetime field in the solr schema browser I can see that there are 9051 distinct dates. When I try to add the parameter to my query like: bf=ord(datetime) (on a dismax query) I alw

Re: performance question

2010-01-06 Thread Erik Hatcher
You don't lose copyField capability with dynamic fields. You can copy dynamic fields into a fixed field name like *_s => text or dynamic fields into another dynamic field like *_s => *_t Erik On Jan 6, 2010, at 9:35 AM, A. Steven Anderson wrote: Strictly speaking there is some ins
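
The two mappings mentioned above (*_s => text and *_s => *_t) can be sketched as a small resolver. This is a hypothetical illustration of the naming rule only, not Solr's code: copy_targets and the rules list are invented for the example, and the wildcard handling assumes a single * per pattern.

```python
import fnmatch


def copy_targets(field_name, rules):
    """Return destination fields for field_name under glob-style rules.

    rules is a list of (source_pattern, dest_pattern) pairs. A '*' in the
    destination is replaced by whatever the source '*' matched.
    """
    targets = []
    for source, dest in rules:
        if not fnmatch.fnmatchcase(field_name, source):
            continue
        if "*" in source and "*" in dest:
            # Recover the text matched by '*' and substitute it into dest.
            prefix, suffix = source.split("*", 1)
            matched = field_name[len(prefix):len(field_name) - len(suffix)]
            targets.append(dest.replace("*", matched, 1))
        else:
            targets.append(dest)
    return targets


rules = [("*_s", "text"), ("*_s", "*_t")]
print(copy_targets("title_s", rules))
```

A field that matches no source pattern simply gets no copies, so fixed and dynamic destinations can be mixed in the same rule list.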

Re: performance question

2010-01-06 Thread A. Steven Anderson
> Strictly speaking there is some insignificant distinctions in performance > related to how a field name is resolved -- Grant alluded to this > earlier in this thread -- but it only comes into play when you actually > refer to that field by name and Solr has to "look them up" in the > metadata. S

Re: Basic sentence parsing with the regex highlighter fragmenter

2010-01-06 Thread Erick Erickson
Hmmm, the name WordDelimiterFilterFactory might be leading you astray. Its purpose isn't to break things up into "words" that have anything to do with grammatical rules. Rather, its purpose is to break up strings of funky characters into searchable stuff. see: http://wiki.apache.org/solr/Analyzers

Yankee's Solr integration

2010-01-06 Thread Nicolas Kern
Hello everybody, I was wondering how Yankee ( http://www.yankeegroup.com/search.do?searchType=advancedSearch) managed to provide the ability to create alerts, save searches, and generate an RSS feed from a custom search using Solr. Do you have any idea? Thanks a lot, Best regards & happy ne

schema.xml and Xinclude

2010-01-06 Thread Patrick Sauts
As the schema.xml files are the same across all our indexes, I'd like to share them via XInclude, so I tried: xmlns:xi="http://www.w3.org/2001/XInclude"> - - - Is my syntax incorrect, or is this not possible yet? Thank you again for your time. Patrick.

Re: readOnly=true IndexReader

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Jan 6, 2010 at 4:26 PM, Patrick Sauts wrote: > In the Wiki page : http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, > I've found > -Open the IndexReader with readOnly=true. This makes a big difference when > multiple threads are sharing the same reader, as it removes certain source

Re: readOnly=true IndexReader

2010-01-06 Thread Shalin Shekhar Mangar
On Wed, Jan 6, 2010 at 4:26 PM, Patrick Sauts wrote: > In the Wiki page : > http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, I've found > -Open the IndexReader with readOnly=true. This makes a big difference when > multiple threads are sharing the same reader, as it removes certain source

readOnly=true IndexReader

2010-01-06 Thread Patrick Sauts
In the Wiki page : http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, I've found -Open the IndexReader with readOnly=true. This makes a big difference when multiple threads are sharing the same reader, as it removes certain sources of thread contention. How to open the IndexReader with

Re: Rules engine and Solr

2010-01-06 Thread Avlesh Singh
Thanks for the revert, Ravi. I am currently working on some kind of rules in front > (application side) of our solr instance. These rules are more application > specific and are not general. Like deciding which fields to facet, which > fields to return in response, which fields to highlight, boost