Re: Solr 3.1 / Java 1.5: Exception regarding analyzer implementation

2011-05-09 Thread Martin Jansen
On 09.05.11 11:04, Martin Jansen wrote:
> I just attempted to set up an instance of Solr 3.1 in Tomcat 5.5 running
> in Java 1.5.  It fails with the following exception on start-up:
> 
>> java.lang.AssertionError: Analyzer implementation classes or at least their 
>> tokenStream() and reusableTokenStream() implementations must be final at 
>> org.apache.lucene.analysis.Analyzer.assertFinal(Analyzer.java:57)

In the meantime I solved the issue by installing Java 1.6.  Works
without a problem now, but I'm wondering if Solr 3.1 is intentionally
incompatible with Java 1.5 or if it happened by mistake.

Martin


Solr 3.1 / Java 1.5: Exception regarding analyzer implementation

2011-05-09 Thread Martin Jansen
I just attempted to set up an instance of Solr 3.1 in Tomcat 5.5 running
in Java 1.5.  It fails with the following exception on start-up:

> java.lang.AssertionError: Analyzer implementation classes or at least their 
> tokenStream() and reusableTokenStream() implementations must be final at 
> org.apache.lucene.analysis.Analyzer.assertFinal(Analyzer.java:57)

The exact same configuration works like a charm on another machine with
Java 1.6, again using Tomcat 5.5.  Has anyone else run into this issue?
Is Solr 3.1 not compatible with Java 1.5 anymore?

The query analyzer that the exception seems to stem from looks like this:

>   
> 
> 
> 
> 
>  generateWordParts="1"
> generateNumberParts="1"
> catenateWords="0"
> catenateNumbers="0"
> catenateAll="0"
> splitOnCaseChange="1"
> preserveOriginal="1" 
> stemEnglishPossessive="0" 
> splitOnNumerics="0" 
> />
> 
>  minShingleSize="2" 
> maxShingleSize="5" 
> outputUnigrams="true"
> />
>   

Best,
- Martin


Re: Indexing all permutations of words from the input

2011-01-20 Thread Martin Jansen
On 20.01.11 22:19, Jonathan Rochkind wrote:
> On 1/20/2011 4:03 PM, Martin Jansen wrote:
>> I'm looking for an <analyzer> configuration for Solr 1.4 that
>> accomplishes the following:
>>
>> Given the input "abc xyz foo" I would like to add at least the following
>> token combinations to the index:
>>
>> abc
>> abc xyz
>> abc xyz foo
>> abc foo
>> xyz
>> xyz foo
>> foo
>>
> Why do you want to do this, what is it meant to accomplish?  There might be a 
> better way to accomplish what it is you are trying to do; I can't think of 
> anything (which doesn't mean it doesn't exist) that what you're actually 
> trying to do would be required in order to do.  What sorts of queries do you 
> intend to serve with this setup?

I'm in the process of setting up an index for term suggestion. In my use
case people should get the suggestion "abc foo" for the search query
"abc fo", assuming that "abc xyz foo" has been submitted to the index.

My current plan is to use TermsComponent with the terms.prefix=
parameter for this, because it seems to be pretty efficient and I get
things like correct sorting for free.
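
A minimal sketch of the kind of request this plan implies, assuming a
TermsComponent handler is reachable at /terms.  The host, the handler path
and the field name "suggest" are hypothetical placeholders and do not come
from this thread; terms.fl, terms.prefix, terms.limit and terms.sort are
the TermsComponent parameters being illustrated.

```python
from urllib.parse import urlencode


def terms_prefix_url(base_url, field, prefix, limit=10):
    """Build a TermsComponent request URL for prefix-based suggestions."""
    params = {
        "terms": "true",           # switch the component on
        "terms.fl": field,         # field whose indexed terms are listed
        "terms.prefix": prefix,    # only return terms starting with this
        "terms.limit": str(limit), # cap the number of suggestions
        "terms.sort": "count",     # most frequent terms first
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and field name, for illustration only.
print(terms_prefix_url("http://localhost:8983/solr/terms", "suggest", "abc fo"))
```

Since terms.prefix is matched against indexed terms, "abc foo" only comes
back for "abc fo" if it exists in the field as one single term.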

I assume there is a better way for achieving this then?

- Martin


Indexing all permutations of words from the input

2011-01-20 Thread Martin Jansen
Hey there,

I'm looking for an <analyzer> configuration for Solr 1.4 that
accomplishes the following:

Given the input "abc xyz foo" I would like to add at least the following
token combinations to the index:

abc
abc xyz
abc xyz foo
abc foo
xyz
xyz foo
foo

A WhitespaceTokenizer combined with a ShingleFilter will take me there
to some extent, but won't e.g. add "abc foo" to the index.  Is there a
way to do this?
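
To make the gap concrete, here is a small sketch in plain Python (not Solr
code) that contrasts contiguous shingles with the order-preserving
combinations listed above.  The function names are made up for this
illustration, and shingles() only approximates what a shingle filter emits.

```python
from itertools import combinations


def shingles(tokens, max_size):
    """Contiguous runs of 1..max_size tokens (unigrams included)."""
    return [
        " ".join(tokens[i:i + n])
        for i in range(len(tokens))
        for n in range(1, max_size + 1)
        if i + n <= len(tokens)
    ]


def ordered_combinations(tokens):
    """Every non-empty subset of the tokens, keeping the input word order."""
    return [
        " ".join(combo)
        for n in range(1, len(tokens) + 1)
        for combo in combinations(tokens, n)
    ]


tokens = "abc xyz foo".split()
print(shingles(tokens, 3))           # contiguous only, so no "abc foo"
print(ordered_combinations(tokens))  # includes the gapped "abc foo"
```

Note that the number of combinations is 2^n - 1 for n input tokens, so this
approach is only practical for short inputs.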

- Martin