The Porter stemmer is not only aggressive, it is ugly, too. The generated
code is old, not very object-oriented, and is probably slow as well.
If your KStem compiles with Java 1.4, why don't you suggest it for Lucene
core?
M.
Wagner,Harry wrote:
Hi HH,
Here's a note I sent Solr-dev a while back:
Robert Haschart [EMAIL PROTECTED] wrote:
To answer your questions: I completely deleted the index each time
before retesting, and the java command as shown by ps does show -Xbatch.
The program is running on:
uname -a
Linux lab8.betech.virginia.edu 2.6.18-53.1.14.el5 #1 SMP Tue Feb
Hi,
There is a new patch which implements these features. I shall
update the wiki with the documentation.
I guess we do not need to be too worried about the memory consumption.
A few MB of memory should be fine (unless you are using a file that
is tens of MB in size). Consider using
Thanks Ryan. I just opened SOLR-546. Please let me know if I can provide
further help. Cheers! h
-----Original Message-----
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Monday, April 21, 2008 2:33 PM
To: solr-user@lucene.apache.org
Subject: Re: better stemming engine than Porter?
Hey-
On Apr 21, 2008, at 5:02 PM, Francisco Sanmartin wrote:
Is it possible to boost the query that MoreLikeThis returns before
sending it to Solr? I mean, technically it is possible, because you
can add a factor to the whole query, but does it make sense?
(Remember that MoreLikeThis can already
I know that a single query of that type does not change anything. But
when there are two or more with different boosts, I hope it does. Here is
the situation:
My docs have a Title and a Description. What I want to do is give
more relevance to the MoreLikeThis match on the title than on the
description.
So
No, the MLT feature does not have that kind of field-specific
boosting capability. It sounds like it could be a useful enhancement
though. Of course you do get boosts for interesting terms already,
but maybe having an additional field-specific boost would be a nice
touch too.
It should help to weight the terms with their frequency in the
original document. That will distinguish between two documents
with the same terms, but different focus.
wunder
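A rough sketch of the frequency-weighting idea above, done on the client side: count each term in the original document and use the count as a per-term boost in Lucene query syntax (field:term^boost). The function names, the whitespace tokenizer, and the top_n cutoff are all assumptions made for illustration; this is not how the MLT handler itself works, and real terms would also need query-syntax escaping.

```python
from collections import Counter


def term_weights(text):
    # Hypothetical tokenizer: lowercase, then split on whitespace.
    return Counter(text.lower().split())


def boosted_query(field, text, top_n=3):
    # Boost each of the most frequent terms by its raw frequency
    # in the original document, using field:term^boost clauses.
    weights = term_weights(text)
    clauses = [
        f"{field}:{term}^{freq}"
        for term, freq in weights.most_common(top_n)
    ]
    return " OR ".join(clauses)


# Two documents with the same terms but a different focus
# end up with different boosts.
doc_a = "solr solr solr lucene"
doc_b = "solr lucene lucene lucene"
print(boosted_query("title", doc_a))
print(boosted_query("title", doc_b))
```

A field-specific factor (say, a larger multiplier for title than for description) could be multiplied into freq in the same place, which is one way to approximate the title-versus-description weighting asked about earlier in the thread.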
On 4/22/08 7:46 AM, Erik Hatcher [EMAIL PROTECTED] wrote:
No, the MLT feature does not have that kind of field-specific
For the kind of searching we do over news content, we need a more
sophisticated query language.
Currently the Solr query language is not enough for our needs.
I understand it is possible to add our own customized query parser to the
system, but I was wondering if anybody has done that and if
Hi Wagner,
Thanks for the intro to KStem! I quickly scanned the original paper on
KStem by Robert Krovetz but could not find any timing comparison between
KStem and the Porter stemmer. How fast or slow is KStem compared to the
Porter stemmer, based on your use of it in your application?
Jay
Wagner,Harry
Hi Jay,
I did not do a timing comparison either, but I did not notice any change
in performance after switching to KStem. Cheers... h
-----Original Message-----
From: Jay [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 22, 2008 12:26 PM
To: solr-user@lucene.apache.org
Subject: Re: better
On 19-Apr-08, at 3:02 AM, Christian Wittern wrote:
Mike Klaas wrote:
Fragments are generated independently from matching (I realize this
isn't an ideal algorithm).
So it could be that the match is not part of the fragment? This
sounds a bit strange. Is there a way to make sure the
Hi,
I'm (still) seeking more advice on this deployment issue, which is to use
org.apache.log4j instead of java.util.logging. I'm not looking to restart
any discussion of the respective benefits of solr4j/commons/log4j/jul; I'm
seeking a way to bridge jul to log4j with the minimum specific per-container
Mike Klaas wrote:
On 19-Apr-08, at 3:02 AM, Christian Wittern wrote:
So it could be that the match is not part of the fragment? This
sounds a bit strange. Is there a way to make sure the fragment
contains the match other than returning the whole field and doing the
fragmenting myself?
[...]
I actually doubt Porter's is slow. From what I recall, it's a bunch of simple
if/elses.
KStem can't be added to Lucene core because of its license (search Lucene JIRA
for an issue that covered this several years ago).
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
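To illustrate the "bunch of simple if/elses" point, here is a minimal sketch of just step 1a of the published Porter algorithm (plural suffix stripping). It is not the Lucene implementation, the function name is made up for this example, and the remaining steps are omitted.

```python
def porter_step_1a(word):
    # Step 1a of the Porter algorithm: the first matching
    # suffix rule wins, checked from longest to shortest.
    if word.endswith("sses"):
        return word[:-2]   # SSES -> SS  (caresses -> caress)
    elif word.endswith("ies"):
        return word[:-2]   # IES  -> I   (ponies -> poni)
    elif word.endswith("ss"):
        return word        # SS   -> SS  (caress -> caress)
    elif word.endswith("s"):
        return word[:-1]   # S    -> ""  (cats -> cat)
    return word


for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", porter_step_1a(w))
```

Each step is a short chain of suffix checks like this one, so the per-token cost is a handful of string comparisons.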
-
I'm using the Spellchecker handler but am a little confused. The docs say to
run cmd=rebuild when building the first time. Do I need to supply a q
param with that cmd=rebuild? The examples show a URL with the q param set
while rebuilding, but the main section on the cmd param doesn't say much