Re: upgraded from 2.9 to 3.x, problems. help?

2011-07-04 Thread Chris Hostetter

: i recently upgraded all systems for indexing and searching to lucene/solr 3.1,
: and unfortunately it seems there are a lot more changes under the hood than
: there used to be.

it sounds like you are saying you had a system that was working fine for
you, but when you tried to upgrade it stopped working.

: i have a java based indexer and a solr based searcher, on the java end for
...
: Analyzer an = new StandardAnalyzer(Version.LUCENE_31, nostopwords);

right off the bat, that line of code couldn't possibly have been in your
existing 2.9 code (Version.LUCENE_31 didn't exist in 2.9) and it
instructs StandardAnalyzer to do some very basic things very
differently than they were done in 2.9...

http://lucene.apache.org/java/3_1_0/api/all/org/apache/lucene/analysis/standard/StandardAnalyzer.html

I would start by setting that to Version.LUCENE_29 to tell
StandardAnalyzer that you want the same behavior as before.

Having said all of that -- the LUCENE_31 behavior is considered better than the
LUCENE_29 behavior, so you should consider changing that to get the benefits
-- but you need to understand your full analysis stack to do that.

: and for the solr end i have:

...you should also check whether you added a <luceneMatchVersion/> of LUCENE_31
to your solrconfig.xml -- if not, do so, so that it's consistent with your
external java code.

generally speaking, just having your indexer use an off the shelf
analyzer while your solr instance uses something like WordDelimiterFilter
isn't going to work well; you need to think about index time analysis and
query time analysis in conjunction with each other.
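
To make the index-time/query-time point concrete, here is a minimal stand-alone sketch in plain Java. It is not Lucene or Solr code: the class AnalysisConsistencySketch and its normalize method are hypothetical names invented for illustration, and the normalization rule (lowercase, treat every non-alphanumeric character as a separator) is only an assumption about what "the same thing" should mean for the spellings mentioned in this thread. The idea it shows is that matching only works when the indexer and the searcher reduce text to the same tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an analyzer; not a Lucene or Solr class.
class AnalysisConsistencySketch {

    // Lowercase the text, treat every run of non-alphanumeric characters
    // as a single separator, and return the remaining tokens in order.
    static List<String> normalize(String text) {
        String cleaned = text.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", " ")
                .trim();
        if (cleaned.isEmpty()) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(Arrays.asList(cleaned.split(" ")));
    }

    public static void main(String[] args) {
        // The three spellings from this thread reduce to the same token list
        // when the same rule is applied on both the index and the query side.
        for (String q : new String[] {"yale l.j.", "yale l. j.", "yale l j"}) {
            System.out.println(q + " -> " + normalize(q));
        }
    }
}
```

If the indexing side used this rule while the query side kept "l.j" as one token (or the other way around), a phrase search would find nothing even though both sides look reasonable on their own. In a real setup the equivalent guarantee would come from configuring matching analyzers for indexing and searching, not from a helper like this one.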

hang on, scratch that -- you may think you are using
WordDelimiterFilterFactory, but you are not...

: <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
:
: <filter class="solr.WordDelimiterFilterFactory"
: generateWordParts="1" generateNumberParts="1" catenateWords="1"
: catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
: <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"
: ignoreCase="true" />
: </fieldType>

...you can't just plop a <filter/> tag in a <fieldType/> like that and
have it mean something. <filter/> can be used when you are declaring a
custom analyzer chain in the schema.xml; if you use <analyzer class="..."
/> you get a concrete analyzer that has hardcoded behavior.

so if you aren't getting matches, it's a straight up discrepancy between
the LUCENE_31 in your indexer and whatever setting you have in solrconfig.xml
(which, if you didn't add one to your existing config, is going to be a legacy
default ... 2.4 or 2.9 ... i can't remember)

-Hoss

Re: upgraded from 2.9 to 3.x, problems. help?

2011-07-03 Thread Erick Erickson

Can you post the results of adding debugQuery=on to your two versions? And
have you re-indexed or not?

Best
Erick
On Jul 1, 2011 12:31 PM, dhastings <dhasti...@wshein.com> wrote:
> i guess what i'm asking is how to set up solr/lucene to find
> yale l.j.
> yale l. j.
> yale l j
> as all the same thing.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/upgraded-from-2-9-to-3-x-problems-help-tp3129348p3129520.html
> Sent from the Solr - User mailing list archive at Nabble.com.

upgraded from 2.9 to 3.x, problems. help?

2011-07-01 Thread dhastings

i recently upgraded all systems for indexing and searching to lucene/solr 3.1,
and unfortunately it seems there are a lot more changes under the hood than
there used to be.

i have a java based indexer and a solr based searcher, on the java end for
the indexing this is what i have:

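    // stop set holding a single token (presumably a placeholder, so that
    // ordinary stop words still get indexed)
    Set<String> nostopwords = new HashSet<String>();
    nostopwords.add("needtoindexstopwords");
    // LUCENE_31 selects the 3.1 tokenization rules of StandardAnalyzer
    Analyzer an = new StandardAnalyzer(Version.LUCENE_31, nostopwords);
    writer = new IndexWriter(fsDir, an, MaxFieldLength.UNLIMITED);

    // analyzed but not stored: the field is searchable, its text is not retrievable
    doc.add(new Field("text", contents, Field.Store.NO, Field.Index.ANALYZED));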

and for the solr end i have:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">

<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
<analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"
ignoreCase="true" />
</fieldType>


and it seems to be working well enough, EXCEPT i somehow lost matching
against strings like:
97 Yale L.J. 1493
which with 2.9 gave me 753 results in my data, and now with 3.1 gives me
105

is there something i can change in the indexer to restore what
used to be the default behavior of the standard analyzer, or is this a
problem with my solr schema versus the data?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/upgraded-from-2-9-to-3-x-problems-help-tp3129348p3129348.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: upgraded from 2.9 to 3.x, problems. help?

2011-07-01 Thread dhastings

i guess what i'm asking is how to set up solr/lucene to find
yale l.j.
yale l. j.
yale l j
as all the same thing.

