Regards
Uwe Goetzke
-Original Message-
From: Cedric Ho [mailto:[EMAIL PROTECTED]
Sent: Saturday, 10 November 2007 02:28
To: java-user@lucene.apache.org
Subject: - Re: Chinese Segmentation with Phrase Query
On Nov 10, 2007 2:08 AM, Steven A Rowe [EMAIL PROTECTED] wrote:
This week I switched the Lucene library version on one customer system.
The indexing time went down from 46m32s to 16m20s for the complete task,
including optimisation. Great job!
We index product catalogs from several suppliers, in this case around
56,000 product groups and 360,000 products.
Hi,
I do not yet fully understand what you want to achieve.
You want to spread the index, split by keywords, to reduce the time it takes
to distribute the indexes?
And you want to distribute queries to the nodes based on the same split
mechanism?
You have several nodes with different kinds of documents.
else, and this is
the reason the total indexing process is not that much faster.
Best Regards,
Ivan
Hi Cuong,
I have written a TolerantPhraseScorer starting with the code from PhraseScorer,
but I think I have modified it too much for it to be generally useful. We use it
with bigram clusters, and it therefore does not need the slop factor for scoring
but has a tolerance factor (depending on the length of the
abbreviations).
the NGramAnalyzer?
-jake
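For context, the bigram clustering mentioned above can be pictured with a small helper that emits overlapping two-character tokens, the usual fallback for CJK text when no dictionary-based segmenter is available. This is a plain-Java sketch, not Lucene's NGramAnalyzer, and the class and method names are illustrative:

```java
// Character-bigram tokenizer sketch (plain Java, not Lucene's NGramAnalyzer).
// Splits a string into overlapping two-character tokens.
import java.util.ArrayList;
import java.util.List;

public class BigramSketch {
    static List<String> bigrams(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < text.length(); i++) {
            out.add(text.substring(i, i + 2));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(bigrams("ABCD")); // overlapping pairs: AB, BC, CD
    }
}
```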
On 3/24/08, Uwe Goetzke [EMAIL PROTECTED] wrote:
Hi Ivan,
No, we do not use StandardAnalyser or StandardTokenizer.
Most data is processed by
fTextTokenStream = result = new
org.apache.lucene.analysis.WhitespaceTokenizer(reader);
result = new ISOLatin2AccentFilter(result);
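The accent-folding step in that chain can be sketched with only the JDK: decompose characters with java.text.Normalizer and strip the combining marks. This stands in for what ISOLatin2AccentFilter does in the chain above; it is a sketch, not Lucene's actual filter:

```java
// Accent-folding sketch using only the JDK (java.text.Normalizer).
// NFD decomposition splits "ä" into "a" + combining diaeresis; the regex
// then drops all combining marks, so "Gerät" becomes "Gerat".
import java.text.Normalizer;

public class AccentFoldSketch {
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", ""); // drop combining marks
    }

    public static void main(String[] args) {
        System.out.println(fold("Gerät")); // Gerat
        System.out.println(fold("Câble")); // Cable
    }
}
```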
?
Thanks!
Jay
Regards
Uwe Goetzke
Head of Product Development
Healy Hudson GmbH
Procurement Retail Solutions
-Original Message-
From: Sascha Fahl [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 18 November
Hello Ganesh,
What about building a separate index for each day, running your analysis, and
merging that index afterwards?
I am not sure, but I think this might work. Use MultiSearcher for the search.
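The per-day split suggested above can be pictured with plain collections: one small "index" per day, queried together and the hits merged. In Lucene, MultiSearcher (or a MultiReader) plays the role of searchAll() here; the maps and names below are purely illustrative:

```java
// Sketch of per-day indexes queried together, with plain HashMaps standing
// in for the daily Lucene indexes. searchAll() merges hits across days.
import java.util.*;

public class PerDaySketch {
    static List<String> searchAll(List<Map<String, List<String>>> dailyIndexes,
                                  String term) {
        List<String> hits = new ArrayList<>();
        for (Map<String, List<String>> day : dailyIndexes) {
            hits.addAll(day.getOrDefault(term, Collections.emptyList()));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, List<String>> monday = Map.of("cable", List.of("doc1"));
        Map<String, List<String>> tuesday =
            Map.of("cable", List.of("doc7"), "dvd", List.of("doc9"));
        System.out.println(searchAll(List.of(monday, tuesday), "cable"));
    }
}
```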
Regards
Uwe Goetzke
-Original Message-
From: Ganesh [mailto:emailg...@yahoo.co.in
GetPropertyAction(file.separator))).charAt(0);
Which sounds more than strange to me...
Any idea?
Regards
Uwe Goetzke
---
Healy Hudson GmbH - D-55252 Mainz Kastel
Managing Director Christian Konhäuser - Amtsgericht Wiesbaden HRB
Oops, sorry, 2.4.1
Thx
Uwe Goetzke
-Original Message-
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Monday, 31 August 2009 17:42
To: java-user@lucene.apache.org
Subject: RE: MergePolicy$MergeException because of FileNotFoundException
because of wrong path to index file
Regarding Part 3:
Data quality
For our search domain (catalog products), we very often face the problem that
the search data is full of acronyms and abbreviations, like:
cable,nym-j,pvc,3x2.5mm²
or
dvd-/cd-/usb-carradio,4x50W,divx,bl
We solved this by a combination of normalization for better data
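A minimal sketch of such a normalization step: lowercase the raw catalog string and split it on commas, slashes, and hyphens so that entries like "dvd-/cd-/usb-carradio" yield individually searchable tokens. The delimiter set here is an assumption for illustration, not the actual pipeline from this thread:

```java
// Normalization sketch for abbreviation-heavy catalog strings: lowercase,
// then split on a delimiter set of comma, slash, and hyphen.
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CatalogNormalizeSketch {
    static List<String> normalize(String raw) {
        return Arrays.stream(raw.toLowerCase().split("[,/\\-]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(normalize("dvd-/cd-/usb-carradio,4x50W,divx,bl"));
    }
}
```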
Index everything into a directory and determine the size of all the files in it.
From http://lucene.apache.org/java/3_0_1/fileformats.html:
Starting with Lucene 2.3, doc store files (stored field values and term
vectors) can be shared in a single set of files for more than one segment. When
compound file
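Measuring the index size as suggested above can be sketched with plain JDK I/O: sum the byte lengths of the regular files in the index directory. The file names below are illustrative stand-ins for real Lucene index files:

```java
// Sum the sizes of all regular files in a directory via java.nio.file.
// A temp directory with two dummy "index files" is used for demonstration.
import java.io.IOException;
import java.nio.file.*;

public class IndexSizeSketch {
    static long totalBytes(Path dir) throws IOException {
        long sum = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                if (Files.isRegularFile(f)) sum += Files.size(f);
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index");
        Files.write(dir.resolve("_0.cfs"), new byte[1024]);   // dummy segment
        Files.write(dir.resolve("segments_2"), new byte[100]); // dummy metadata
        System.out.println(totalBytes(dir)); // 1024 + 100
    }
}
```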
I got stuck on a problem using NumericField with Lucene 2.9.3.
I add values to the document by
doc.add(new NumericField("minprice").setDoubleValue(net_price));
If I want to search with a sorter on this field, I get this error:
java.lang.NumberFormatException: Invalid shift
You should reindex the whole index, or at
least try to optimize the index to get rid of the deleted documents and their
terms.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Uwe Goetzke [mailto:uwe.goet...@veenion.de