ICUTransformFilterFactory

2013-08-02 Thread Jochen Lienhard

Hello,

We have a problem with some special characters, for example æ.


We are using the ICUTransformFilterFactory for indexing and searching.

We have some documents with urianae and some with urianæ.

If I search for urianae, I find only the versions with urianae but not 
those with urianæ.

Only if I search for urianae* do I find both versions.

Is it possible (perhaps with special IDs in the 
ICUTransformFilterFactory) to find all versions without an asterisk?


Greetings from Germany

Jochen Lienhard

--
Dr. rer. nat. Jochen Lienhard
Dezernat EDV

Albert-Ludwigs-Universität Freiburg
Universitätsbibliothek
Rempartstr. 10-16  | Postfach 1629
79098 Freiburg | 79016 Freiburg

Telefon: +49 761 203-3908
E-Mail: lienh...@ub.uni-freiburg.de
Internet: www.ub.uni-freiburg.de
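
A note on the question above: the thread does not say which transform ID was configured, so the following is only a sketch of the underlying Unicode behaviour, not of Solr itself. The letter æ (U+00E6) has no Unicode decomposition, so plain normalization never turns it into "ae"; an explicit mapping is needed at both index and query time. The table LIGATURE_MAP and the function fold are hypothetical names made up for this illustration.

```python
import unicodedata

# Hypothetical folding table for illustration only; extend as needed.
LIGATURE_MAP = str.maketrans({"æ": "ae", "Æ": "AE", "œ": "oe", "Œ": "OE"})

def fold(term: str) -> str:
    """Expand ligatures, then drop combining marks after NFKD decomposition."""
    expanded = term.translate(LIGATURE_MAP)
    decomposed = unicodedata.normalize("NFKD", expanded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# NFKD alone leaves the ligature untouched, because U+00E6 does not decompose.
assert unicodedata.normalize("NFKD", "urianæ") == "urianæ"

# With the explicit mapping, both spellings fold to the same term.
print(fold("urianæ"))
print(fold("urianae"))
```

If the indexed and the queried form both pass through the same folding, an exact match becomes possible without a trailing asterisk.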



Re: UnInverted multi-valued field

2013-06-20 Thread Jochen Lienhard

Hello,

Well ... we have 5 multi-valued facet fields, so you sometimes have to 
wait up to one minute.


The old searcher blocks during this time.

@Toke Eskildsen: the example I posted was a very small update, usually 
there are more terms.


We are using Solr 3.6. I don't know if it will be faster with 4.x.

These are the configurations of our cache:

<filterCache
  class="solr.FastLRUCache"
  size="30"
  initialSize="30"
  autowarmCount="5"/>

<queryResultCache
  class="solr.LRUCache"
  size="10"
  initialSize="10"
  autowarmCount="5"/>

<documentCache
  class="solr.LRUCache"
  size="5"
  initialSize="5"
  autowarmCount="1"/>

We have 5 million documents in our index.
@Roman: Do you think our autowarmCount should be larger?

Greetings

Jochen

Roman Chyla wrote:

On Wed, Jun 19, 2013 at 5:30 AM, Jochen Lienhard 
lienh...@ub.uni-freiburg.de wrote:


Hi @all.

We have the problem that after an update the index takes too much time for
'warm up'.

We have some multivalued facet-fields and during the startup solr creates
the messages:

INFO: UnInverted multi-valued field
{field=mt_facet,memSize=18753256,tindexSize=54,time=170,phase1=156,nTerms=17,bigTerms=3,termInstances=903276,uses=0}


In the solrconfig we use the facet.method 'fc'.
We know that start-up with the method 'enum' is faster, but then the
searches are very slow.

How do you handle this problem?
Or do you have any ideas for optimizing the warm-up?
Or what do you do after an update?


You probably know, but just in case: you may use autowarming. The new
searcher will populate its caches, and only after the warmup queries have
finished will it be exposed to the world. The old searcher continues to
handle requests in the meantime.

roman
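
To make the warm-up idea concrete, here is a minimal sketch of a query that touches a facet field once without returning documents. The parameters q, rows, facet, facet.field and facet.method are standard Solr request parameters; the host, port and path in SOLR_SELECT_URL and the helper build_warmup_url are assumptions made up for this illustration. The same parameters could presumably also be placed in a newSearcher warming query (QuerySenderListener) in solrconfig.xml.

```python
from urllib.parse import urlencode

# Hypothetical Solr location; adjust host, port and core to the local setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

def build_warmup_url(facet_fields):
    """Build a query URL that facets on each given field and returns no rows."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.method", "fc"),
    ]
    # One facet.field parameter per multi-valued facet field to be warmed.
    params += [("facet.field", name) for name in facet_fields]
    return SOLR_SELECT_URL + "?" + urlencode(params)

url = build_warmup_url(["mt_facet"])
print(url)
```

Only the URL is built here; nothing is sent, so the sketch runs without a Solr server.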



Greetings

Jochen




UnInverted multi-valued field

2013-06-19 Thread Jochen Lienhard

Hi @all.

We have the problem that after an update the index takes too much time 
for 'warm up'.


We have some multivalued facet-fields and during the startup solr 
creates the messages:


INFO: UnInverted multi-valued field 
{field=mt_facet,memSize=18753256,tindexSize=54,time=170,phase1=156,nTerms=17,bigTerms=3,termInstances=903276,uses=0}



In the solrconfig we use the facet.method 'fc'.
We know that start-up with the method 'enum' is faster, but then 
the searches are very slow.


How do you handle this problem?
Or do you have any ideas for optimizing the warm-up?
Or what do you do after an update?

Greetings

Jochen
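
For anyone who wants to compare such log lines across fields, here is a small sketch that splits the message quoted above into its key/value pairs. The function name parse_uninverted is hypothetical, and reading memSize as bytes and time as milliseconds is an assumption, not something stated in this thread.

```python
import re

LOG_LINE = (
    "INFO: UnInverted multi-valued field "
    "{field=mt_facet,memSize=18753256,tindexSize=54,time=170,"
    "phase1=156,nTerms=17,bigTerms=3,termInstances=903276,uses=0}"
)

def parse_uninverted(line: str) -> dict:
    """Return the key=value pairs between the braces, with numbers as int."""
    body = re.search(r"\{(.*)\}", line).group(1)
    stats = {}
    for pair in body.split(","):
        key, value = pair.split("=", 1)
        stats[key] = int(value) if value.isdigit() else value
    return stats

stats = parse_uninverted(LOG_LINE)
# Assumption: memSize is in bytes and time is in milliseconds.
print(f"{stats['field']}: {stats['memSize'] / 2**20:.1f} MiB, {stats['time']} ms")
```

Summing the time values over all facet fields gives a rough idea of how much of the warm-up is spent un-inverting them.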




removing whitespaces in query

2013-03-07 Thread Jochen Lienhard

Hello,

We have indexed a field from which we removed the whitespace before 
indexing.


For example:

50A91
Frei91\:9984

Now we want to allow users to search for:

50 A 91
Frei 91 \: 9984

Our idea was to add a PatternReplaceFilterFactory to the query analyzer 
to remove the whitespace:
<charFilter class="solr.PatternReplaceFilterFactory" pattern="(\s+)" replacement="" replace="all"/>


But it does not work.

For normal queries (we are using VuFind as the frontend) we can remove 
the whitespace in the YAML part, but if the user searches with 
wildcards, the YAML does not work, so we hope to find a solution in Solr.


We are using solr 3.6.

Thanks for ideas and hints.

Greetings from Germany

Jochen
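
One possible explanation, which nobody in this thread confirms: Solr's standard query parser splits the query string on whitespace before the field's analyzer runs, so a char filter in the query analyzer only ever sees the individual pieces and never the spaces between them. Under that assumption the whitespace would have to go before the query reaches the parser. The sketch below shows such a normalization step; the function name normalize_call_number is hypothetical.

```python
import re

WHITESPACE = re.compile(r"\s+")

def normalize_call_number(user_input: str) -> str:
    """Remove all whitespace so the input matches the indexed form."""
    return WHITESPACE.sub("", user_input)

# The two examples from the message, plus a wildcard search.
print(normalize_call_number("50 A 91"))
print(normalize_call_number(r"Frei 91 \: 9984"))
print(normalize_call_number("Frei 91*"))
```

Wildcard characters and the escaped colon pass through untouched, because only whitespace is removed.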




Re: removing whitespaces in query

2013-03-07 Thread Jochen Lienhard

Hello Jilal and Oliver,

Hmmm ... I don't see how two fields can help.

The problem seems to be that Solr does not recognize the whitespace.

We are using the following analyzer:
<analyzer type="query">
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="Frei" replacement="blubb" replace="all"/>
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
  <filter class="solr.ICUFoldingFilterFactory"/>
  <filter class="solr.TrimFilterFactory"/>
</analyzer>

In the query Frei 91 \: 9984 it replaces Frei with blubb ... so it 
seems to work perfectly.

But when we try to replace whitespace using \s, nothing happens.

@Oliver: we don't want to replace the : in the query ... it is part of 
our call numbers.


Greetings

Jochen

Oliver Schihin wrote:

Hello Jochen

What are your tokenizers? I guess it should be 
'KeywordTokenizerFactory'. To fully understand, you might send the 
whole analyzer chain.


But there might be a simple mistake in your pattern: character classes 
are enclosed in square brackets. We replace all non-alphanumeric 
characters like this:

<filter class="solr.PatternReplaceFilterFactory"
        pattern="[^\w]+"
        replacement=""
        replace="all"
/>

If that helps.
Regards from Basel
Oliver
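
To see how the two patterns discussed in this thread differ, here is a small comparison on the call number from the original question. It uses Python's re module rather than the Java regex engine behind Solr; for these ASCII inputs \s and \w should behave the same in both, but that is an assumption worth checking. Note that \s+ is a valid pattern on its own, without square brackets.

```python
import re

query = r"Frei 91 \: 9984"

# Pattern from the reply above: removes every run of non-word characters,
# which also drops the backslash and the colon.
stripped_all = re.sub(r"[^\w]+", "", query)

# Pattern from the original question: removes only whitespace.
stripped_ws = re.sub(r"\s+", "", query)

print(stripped_all)
print(stripped_ws)
```

Only the whitespace-only pattern keeps the colon, which matters when the colon is part of the call number.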

 Original Message 
Subject: removing whitespaces in query
From: Jochen Lienhard lienh...@ub.uni-freiburg.de
To: solr-user@lucene.apache.org
Date: 07.03.2013 10:33


Hello,

We have indexed a field from which we removed the whitespace before 
indexing.


For example:

50A91
Frei91\:9984

Now we want to allow users to search for:

50 A 91
Frei 91 \: 9984

Our idea was to add a PatternReplaceFilterFactory to the query analyzer 
to remove the whitespace:
<charFilter class="solr.PatternReplaceFilterFactory" pattern="(\s+)" replacement="" replace="all"/>

But it does not work.

For normal queries (we are using VuFind as the frontend) we can remove 
the whitespace in the YAML part, but if the user searches with 
wildcards, the YAML does not work, so we hope to find a solution in Solr.

We are using solr 3.6.

Thanks for ideas and hints.

Greetings from Germany

Jochen









solr multicore problem on SLES 11

2012-09-17 Thread Jochen Lienhard

Hello,

I have a problem with solr and multicores on SLES 11 SP 2.

I have 3 cores, each with more than 20 segments.
When I try to start tomcat6, it cannot start the CoreContainer.
Caused by: java.lang.OutOfMemoryError: Map failed
at sun.nio.ch.FileChannelImpl.map0(Native Method)

I have read a lot about this problem, but I have not found a solution.

The strange thing is:

It works fine under openSuSE 12.x, tomcat6, openjdk.

But on the virtual machine with SLES 11 SP 2, tomcat6, openjdk, it 
crashes.


Both tomcat/java configurations are the same.

Does anybody have an idea how to solve this problem?

I have another SLES machine with 5 cores, but each has only 1 segment 
(very small index), and this machine runs fine.


Greetings

Jochen



Re: solr multicore problem on SLES 11

2012-09-17 Thread Jochen Lienhard

Great. Thanks.
That solves my problem.

Greetings

Jochen

André Widhani wrote:

The first thing I would check is the virtual memory limit (ulimit -v; check 
this for the operating system user that runs Tomcat/Solr).

It should be set to unlimited, but as far as I remember this is not the 
default setting on SLES 11.

Since 3.1, Solr maps the index files into virtual memory. So if the size of 
your index files is larger than the allowed virtual memory, it may fail.

Regards,
André
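
The same limit can also be read from inside a process, which helps when it is unclear which limits the Tomcat user actually inherits. This is a minimal sketch using Python's standard resource module on Linux; it assumes that ulimit -v corresponds to RLIMIT_AS, and the helper describe_limit is a hypothetical name. It has to be run as the user that runs Tomcat to be meaningful.

```python
import resource

def describe_limit(value: int) -> str:
    """Render a limit given in bytes the way ulimit -v would show it."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"

# RLIMIT_AS is the address-space (virtual memory) limit of this process.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
print("virtual memory soft limit:", describe_limit(soft))
print("virtual memory hard limit:", describe_limit(hard))
```

If the soft limit printed here is smaller than the combined size of the index files, memory-mapping them may fail in the way described above.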


From: Jochen Lienhard [lienh...@ub.uni-freiburg.de]
Sent: Monday, 17 September 2012 09:17
To: solr-user@lucene.apache.org
Subject: solr multicore problem on SLES 11

Hello,

I have a problem with solr and multicores on SLES 11 SP 2.

I have 3 cores, each with more than 20 segments.
When I try to start tomcat6, it cannot start the CoreContainer.
Caused by: java.lang.OutOfMemoryError: Map failed
  at sun.nio.ch.FileChannelImpl.map0(Native Method)

I have read a lot about this problem, but I have not found a solution.

The strange thing is:

It works fine under openSuSE 12.x, tomcat6, openjdk.

But on the virtual machine with SLES 11 SP 2, tomcat6, openjdk, it
crashes.

Both tomcat/java configurations are the same.

Does anybody have an idea how to solve this problem?

I have another SLES machine with 5 cores, but each has only 1 segment
(very small index), and this machine runs fine.

Greetings

Jochen
