Some thoughts
1) Very common words can have a large impact on performance. Use MoreLikeThis
to build an optimised boolean query. It costs MoreLikeThis very little to
determine how rare or common a term is, compared to the cost of a query that
has to iterate reader.termDocs for a common word. Try iterating
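A library-free sketch of that trade-off, assuming per-term document frequencies are already at hand (the `selectQueryTerms` helper and the 0.25 cutoff are illustrative, not MoreLikeThis's actual API or defaults):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CommonTermFilter {
    // Keep only terms that are rare enough to be cheap and selective;
    // docFreq maps term -> number of documents containing it.
    static List<String> selectQueryTerms(List<String> terms,
                                         Map<String, Integer> docFreq,
                                         int numDocs,
                                         double maxDocFreqRatio) {
        List<String> kept = new ArrayList<>();
        for (String term : terms) {
            int df = docFreq.getOrDefault(term, 0);
            if (df > 0 && (double) df / numDocs <= maxDocFreqRatio) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Integer> df = Map.of("the", 9500, "lucene", 120, "khmer", 8);
        System.out.println(selectQueryTerms(
                List.of("the", "lucene", "khmer"), df, 10000, 0.25));
        // "the" is dropped: it would match 95% of the index.
    }
}
```

Dropping the over-common terms up front saves the boolean query from iterating huge posting lists that add almost no selectivity.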
Good morning all (or good afternoon)
I have used Lucene many times before to search text in French or English. All
worked fine :-)
But now I have a new challenge: I need to use Lucene with Khmer (Khmer is
Cambodia's language; its script looks like Thai or Indic scripts).
But it doesn't work. My code is
Hello
From the API:
"public class StandardAnalyzer
extends Analyzer
Filters StandardTokenizer with StandardFilter, LowerCaseFilter and
StopFilter, using a list of English stop words."
Are you sure that these filters won't filter your Khmer characters out?
Best,
czinkos
On Wed, Jan 24, 20
One more thing I forgot to tell you ...
It is working with Lucene.Net :)
-----Original Message-----
From: Zsolt Czinkos [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 24, 2007 5:35 PM
To: java-user@lucene.apache.org
Subject: Re: Lucene with Khmer ? (Language in cambodia)
Hi,
I would suggest performing a test with your analyzers, something like:
StringReader reader = new StringReader("your khmer text");
TokenStream stream = analyzer.tokenStream("content", reader);
Iterate through the TokenStream and check whether the analyzed tokens are
correct!
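Lucene 2.x hands back the tokens via TokenStream.next(), which needs the library on the classpath; as a self-contained stand-in, the JDK's BreakIterator can play the analyzer's role to show the shape of the check (the `tokens` helper below is hypothetical, not a Lucene API):

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TokenCheck {
    // Stand-in for analyzer.tokenStream(...): word-break iteration via the JDK.
    public static List<String> tokens(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        List<String> out = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE;
             start = end, end = words.next()) {
            String tok = text.substring(start, end).trim();
            // Skip the whitespace/punctuation segments between words.
            if (!tok.isEmpty() && Character.isLetterOrDigit(tok.codePointAt(0))) {
                out.add(tok);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // With Khmer input, verify the characters survive and the
        // word boundaries look sane before blaming the index.
        System.out.println(tokens("searching Khmer text", Locale.ENGLISH));
    }
}
```

Incidentally, BreakIterator-style segmentation is relevant to the original question: Khmer, like Thai, is written without spaces between words, so a whitespace-oriented tokenizer will not find word boundaries at all.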
Luke is your friend. Use it to see what you have in your index.
On Jan 24, 2007, at 5:29 AM, Fournaux Nicolas wrote:
Hi,
I'm getting an exception while retrieving the 100th element id in Hits. My
sample code is given below:
for(int i=0;
Hey,
I have a problem with Lucene and, because I am a little inexperienced,
I would like to ask you.
I have a database with ca. 2500 items in it. I store these items in a
RAM index and try to rebuild it every 10 minutes. I use the same
procedure as for updating an FSDirectory - deleting and adding a
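One way to realize "rebuild every 10 minutes" without searches ever seeing a half-built index is to build a fresh structure and swap it in atomically, rather than deleting and re-adding in place. A stand-in sketch with a plain map playing the RAMDirectory's role (all names illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class RamIndexRebuild {
    // Stand-in for a RAMDirectory-backed index: term -> item ids.
    private final AtomicReference<Map<String, List<Integer>>> live =
            new AtomicReference<>(new HashMap<>());

    // Build a fresh index from the database rows, then swap it in atomically.
    public void rebuild(Map<Integer, String> databaseRows) {
        Map<String, List<Integer>> fresh = new HashMap<>();
        for (Map.Entry<Integer, String> row : databaseRows.entrySet()) {
            for (String term : row.getValue().toLowerCase().split("\\s+")) {
                fresh.computeIfAbsent(term, t -> new ArrayList<>())
                     .add(row.getKey());
            }
        }
        live.set(fresh); // searches keep using the old map until this point
    }

    public List<Integer> search(String term) {
        return live.get().getOrDefault(term.toLowerCase(), List.of());
    }
}
```

A ScheduledExecutorService firing rebuild(...) every 10 minutes completes the picture; with ca. 2500 items a full rebuild is cheap, and readers never observe a partially deleted index.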
Marcel Morisse wrote:
On 24 Jan 2007, at 16.16, Michael McCandless wrote:
A couple ideas to verify / try:
You can also use a profiler to see what is hogging the resources. I
personally prefer JProfiler, as it plugs right into my IDE.
Hi all,
Can you tell me the exact indexing algorithm used by Lucene, or give some
links to documents that describe it?
Thanks in advance
--
Sairaj Sunil
Hi,
I build my index with the StandardAnalyzer and two fields:
Field field = new Field("text", new FileReader(fullPath));
field = new Field("filepath", fullPath, Store.YES, Index.TOKENIZED);
Now, I want to highlight the search results.
The first version is fine:
TokenStream stream
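Lucene's contrib Highlighter consumes a TokenStream plus the query; as a library-free stand-in for the output it produces, here is the basic wrap-matched-terms idea (the `highlight` helper is a hypothetical sketch, not the Highlighter API):

```java
import java.util.Locale;
import java.util.Set;

public class NaiveHighlighter {
    // Wrap every whitespace-delimited word matching a query term in <b>..</b>.
    public static String highlight(String text, Set<String> queryTerms) {
        StringBuilder out = new StringBuilder();
        for (String word : text.split(" ")) {
            if (out.length() > 0) out.append(' ');
            if (queryTerms.contains(word.toLowerCase(Locale.ROOT))) {
                out.append("<b>").append(word).append("</b>");
            } else {
                out.append(word);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("Lucene in Action", Set.of("lucene")));
        // <b>Lucene</b> in Action
    }
}
```

The real Highlighter improves on this by re-analyzing the stored text with the same analyzer used at index time, so that highlighting matches stemmed or lowercased terms correctly.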
Hi Mukesh,
Are you by any chance deleting docs in that loop, using
the same reader as the one used by the searcher?
If so, using a separate reader for the deletes would fix that.
Also see related discussion -
http://www.nabble.com/Iterating-hits-tf1129306.html#a2955956
Regards,
Doron
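The hazard described above is the general "mutating through the view you are iterating" problem; in this library-free analogue, the two-pass structure plays the role of the separate delete reader (the delete condition is a stand-in):

```java
import java.util.ArrayList;
import java.util.List;

public class CollectThenDelete {
    // Pass 1 only reads while iterating; pass 2 mutates after iteration ends.
    public static List<Integer> deleteMatching(List<Integer> hits) {
        List<Integer> toDelete = new ArrayList<>();
        for (Integer id : hits) {
            if (id % 2 == 0) toDelete.add(id); // stand-in delete condition
        }
        hits.removeAll(toDelete);
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(deleteMatching(new ArrayList<>(List.of(1, 2, 3, 4))));
        // [1, 3]
    }
}
```

Deleting through the searcher's own reader while walking its Hits is the single-pass version of this: the iteration's underlying state changes out from under it mid-loop.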
Mukesh Bhardwaj <[EM
Hi...
I am a final-year undergrad. My final-year project is a search engine for
XML documents. I am currently building this system using Lucene.
An example of an XML element from an XML document:
--
This is my
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
On 1/24/07, Sairaj Sunil <[EMAIL PROTECTED]> wrote:
Doron Cohen wrote:
Also another t
maureen tanuwidjaja wrote:
Before implementing this search engine, I designed the index so that every
XML tag is converted to a binary value, in order to reduce the index size
and perhaps allow faster searching. To illustrate:
article will be converted to 0
article/body
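The tag-to-binary-value scheme described above amounts to a dictionary that hands each distinct XML path the next small integer. A minimal sketch of the idea (the actual encoding in the project may differ):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TagDictionary {
    private final Map<String, Integer> ids = new LinkedHashMap<>();

    // First-seen path gets 0, the next distinct path 1, and so on;
    // repeated lookups return the same id.
    public int idFor(String xmlPath) {
        return ids.computeIfAbsent(xmlPath, p -> ids.size());
    }

    public static void main(String[] args) {
        TagDictionary dict = new TagDictionary();
        System.out.println(dict.idFor("article"));      // 0
        System.out.println(dict.idFor("article/body")); // 1
        System.out.println(dict.idFor("article"));      // 0 again
    }
}
```

Indexing the compact id instead of the full path shrinks the terms, though the dictionary itself must then be stored alongside the index so queries can translate paths back to ids.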