From: Jason van Zyl [mailto:[EMAIL PROTECTED]]
If you can build the javadoc then create a link on the site for it and that
should suffice. We have no central place for generated javadoc.
Okay. The first step according to the HOWTO is to logon to the web server,
but I don't know how to do
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Is there any limit for the size of a field? I've tried to index a
document with a field (UnStored type) of something like 8 chars
and I've noticed that the words which are in the end of that field
aren't indexed... If there is a
Here's one vote for putting locks in a separate directory. Anyone dislike
that?
Doug
-Original Message-
From: Snyder, David [mailto:[EMAIL PROTECTED]]
Sent: Friday, October 05, 2001 11:23 AM
To: Doug Cutting
Subject: RE: Lucene 1.2 and directory write permissions?
The lock file
From: Snyder, David [mailto:[EMAIL PROTECTED]]
I think splitting out the locks into a separate directory
would solve our problem... Do you think this is something
very difficult to do?
No, it will be easy.
our indexes (we
use many with the multisearcher) are about 13 gigs now and
Dmitry,
Wow! This looks great!
I was preparing a response to your questions of last weekend, but it seems
like you figured out a lot of it on your own. I've attached that response
anyway, in case you're still interested.
Once we get 1.2 out the door I'd like to make you a committer
Brian,
Do you know what's going on here? I have not yet had time to look at this.
If you don't have time, and no one else volunteers, then I will look into
it. I would like to fix this for the 1.2 final release, if the change required
is not major.
Doug
-Original Message-
From: [EMAIL
I have added a new file in the top-level of Lucene named 'CHANGES.txt'.
This contains a list of user-visible changes. I've filled in some
historical information.
Committers: please add an entry at the top of this file when you make
changes.
This will serve as release notes.
Thanks,
Doug
--
Can someone with access to Adobe Illustrator please help Lucene?
To build Lucene's new home at Apache Jakarta we need to extract the Lucene
logo from the original Lucene artwork and save it as a set of GIFs. These
should contain just the script of the word Lucene. We need a 300 pixel
wide
From: Jon Stevens [mailto:[EMAIL PROTECTED]]
on 9/24/01 2:52 PM, Doug Cutting [EMAIL PROTECTED] wrote:
I think we should 'cvs rm' all of the files, and change the
README to point to Jakarta. Does that sound reasonable?
Don't do that. It will serve as the repo for the old history
Thanks to all who responded.
Matt Tucker was first and did a fine job, so I'll be using his.
Thanks again,
Doug
From: Ted Husted [mailto:[EMAIL PROTECTED]]
Any Committer with site karma can do this. Right now, that
includes me
;-) I believe Brian would be able to grant the same to you if you want
to try it yourself.
I'm happy to let you do it.
Can you put up the javadoc now? That will fix the
From: Doug Cutting [mailto:[EMAIL PROTECTED]]
Okay, so maybe we should just start with the nightlies.
Actually, it would be nice to have at least a milestone release when we go
public. What's involved in being the release manager? I'm happy to write
up some release notes, if that would
Your analysis looks good to me.
I think it would be simpler, if a bit less optimized, to just make
SegmentsReader.numDocs() and SegmentsReader.delete() synchronized methods.
Does that sound like a reasonable fix to you?
Thanks for spotting this.
As for closing, your analysis also sounds
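The synchronization fix being proposed can be sketched in a few lines. This is an illustrative stand-in, not the real SegmentsReader: the class name and fields below are hypothetical, and only the locking pattern matters.

```java
// Illustrative sketch of the proposed fix: numDocs() and delete() share
// mutable deletion state, so both are synchronized on the same monitor.
// SegmentsReaderSketch and its fields are hypothetical stand-ins.
public class SegmentsReaderSketch {
  private final boolean[] deleted;
  private int numDeleted = 0;

  public SegmentsReaderSketch(int maxDoc) {
    deleted = new boolean[maxDoc];
  }

  // Without synchronization, a concurrent delete() could be observed
  // half-applied; with it, callers always see a consistent count.
  public synchronized int numDocs() {
    return deleted.length - numDeleted;
  }

  public synchronized void delete(int doc) {
    if (!deleted[doc]) {
      deleted[doc] = true;
      numDeleted++;
    }
  }
}
```

This trades a small amount of lock overhead for correctness, which matches the "simpler, if a bit less optimized" framing above.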
From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED]]
But I was looking again at the MultiSearcher after reading
through the SegmentsReader (and friends) and I was
thinking if it wouldn't be better to write MultiSearcher
not in terms of searching over multiple Searchers, but as
an
From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED]]
Doug, thanks for posting these. I may end up going in this
direction in
the next few days and will use this as a blueprint. Maybe I'll end up
putting in the first pass implementation and then you can
later further
tune it when
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Unicode is 16 bits.
Unicode is currently defined as having up to 2^31 positions, although
the current plan is for somewhere between 2^20 and 2^21 characters.
(2^16 characters was the old Unicode standard - dropped when someone
pointed
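A concrete illustration of why the 16-bit assumption matters in Java, whose char type is a UTF-16 code unit: a character beyond U+FFFF occupies two chars but is one code point. This uses the code-point convenience methods added in later JDKs, after the era of this thread.

```java
// U+1D11E (musical G clef) lies beyond the old 16-bit limit discussed
// above, so it needs a surrogate pair in UTF-16: String.length() counts
// 2 code units while there is only 1 Unicode code point.
public class CodePointDemo {
  public static void main(String[] args) {
    String clef = "\uD834\uDD1E"; // U+1D11E
    System.out.println(clef.length());                          // 2
    System.out.println(clef.codePointCount(0, clef.length()));  // 1
  }
}
```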
From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED]]
The latest build.xml works fine with Ant and without the batch files,
but it has a classpath statement that fails if anakia is not
present.
If I remove anakia, then it only fails for me when I try to build the docs
target, which is
investigate.
Commenting it out works fine, but it would be better if we didn't have
to modify this file for different compilation scenarios.
Doug Cutting wrote:
From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED]]
The latest build.xml works fine with Ant and without the
batch files
From: Brian Goetz [mailto:[EMAIL PROTECTED]]
I like the idea of being able to add fields to a Document after the
Document is indexed. Then, for documents with a long 'body' and short
metadata fields, you could process the body through an InputStream
adapter, which would, as a side effect,
From: Andrew C. Oliver [mailto:[EMAIL PROTECTED]]
We've implemented an event based
system for reading documents (so you register for what you care about
and then kick it off and it throws events to listeners as it runs into
them). Not sure if there is a clean way to graft those ideas onto
From: Andrew C. Oliver [mailto:[EMAIL PROTECTED]]
I believe you could submit this request to [EMAIL PROTECTED],
or perhaps Ted could give us some direction on that.
Ted's the one who asked me to remove it.
Doug
--
To unsubscribe, e-mail: mailto:[EMAIL PROTECTED]
For additional commands,
From: Andrew C. Oliver [mailto:[EMAIL PROTECTED]]
Would the demos be pre-compiled in the distribution?
I think they are currently. If they're not, they should be.
As for packaging it in org.apache.lucene.demo in addition to
keeping it
in a separate jar (and hence under demo instead of
Currently information on how to build Lucene is in the BUILD.txt file that
is in CVS and distributed with the source distribution, but not the binary
distribution. Is this document inaccurate or inadequate? Should we improve
it or replace it?
In any case, Lucene build instructions and the
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Can someone who has privileges, please update the website html.
Done.
If I also have privileges, please let me know how to update the site.
ssh www.apache.org
cd /www/jakarta.apache.org/lucene
cvs update -d
Doug
I just made a new release, 1.2RC3, based on the current CVS:
http://jakarta.apache.org/builds/jakarta-lucene/release/v1.2-rc3/
I did some simple tests, and things look good to me. Does anyone see a
reason not to announce this to lucene-user? Hopefully we can turn this into
a 1.2 final
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Another solution is to make a symbolic link (shortcut?) from
./lib/JavaCC.zip to the real JavaCC.zip, which is what I just did.
That works so long as you're not building distributions. The 'dist' and
'dist-src' targets bundle in the
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Does anyone see a problem with moving from Junit 3.5 to Junit 3.7?
+1
Doug
From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED]]
It seems that either a) deletes should be write-through, or
b) deletes should
be done by the writer, or c) writer should not optimize
non-RAM segments unless
asked to. As a client, I like option b) the best, though,
this is not
From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED]]
If there is one user performing additions and deletions,
then the two
can be ordered. But if an application is such that it allows multiple
people to initiate index updates of various kinds, it may be much harder to
order additions
I think this is a great idea. Lucene badly needs this sort of high-level
interface.
As far as other folks' concern about keeping Lucene a library and not making
it an application, I agree, but I also assumed that's what you meant to do.
All of this can be layered on top of the existing API.
Thanks for making all these cleanups, Otis!
One comment:
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 13, 2002 5:47 PM
To: [EMAIL PROTECTED]
Subject: cvs commit: jakarta-lucene/src/java/org/apache/lucene/store
FSDirectory.java
[ ... ]
+ * Examples of
From: Halácsy Péter [mailto:[EMAIL PROTECTED]]
I'd like to index documents that are described by keywords.
One document can have zero or more keywords and a keyword can
be related to one or more documents. Assume two keywords:
human computer interaction
computer science
If I add
From: Julien Nioche [mailto:[EMAIL PROTECTED]]
By the way, I was wondering if there is any Analyzer that
uses the following
constructor
public Token(String text, int start, int end, String typ) ?
StandardTokenizer uses Token's type field to communicate with
StandardFilter, which does
From: Les Hughes [mailto:[EMAIL PROTECTED]]
Reading the servlet spec again it says that calls such as
servletcontext.getRealPath() will *possibly* return null if
the content is
being served from a war as opposed to the physical path on disk
- I'm informed
that weblogic actually returns
From: Halácsy Péter [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 19, 2002 8:49 AM
To: Lucene Developers List; Lucene Users List
Subject: RE: Lucene Query Structure
The queryParser of Lucene implies OR logic if no operator
found in the query, doesn't it?
Yes.
How could I modify
From: Ype Kingma [mailto:[EMAIL PROTECTED]]
I happen to be familiar with a (boolean) query language that
only allows proximity operators between OR-like queries (including prefix terms).
This case is not too difficult to explain and not confusing at all.
It might be not too difficult to
From: Joshua O'Madadhain [mailto:[EMAIL PROTECTED]]
Okay, I think I finally understand how this is working. If we express
the semantics of (required, prohibited) in terms of their
impact on the score for a document D and query q, we get:
(true, false): if q is not satisfied by D,
From: Eric Fixler [mailto:[EMAIL PROTECTED]]
I'm wondering if there's a design reason why HitCollector is
an abstract class, rather than an interface.
I don't recall my thinking, if any, when I did this.
An interface is more flexible, since it can be a mix-in, but calls to
interfaces are
From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED]]
I know at least in my case, I have a much more extensive list of stop
words and they are simply read from a file into an array and
then passed
to the existing class. Would this approach work in your case?
I think that serious
From: Spencer, Dave [mailto:[EMAIL PROTECTED]]
Proposed solution is to change a couple of decls in Scorer and Query:
Scorer.java
make score() public
Query.java
make all methods public or protected (normalize,
sumOfSquaredWeights,prepare)
I'm a little hesitant.
From: Daniel Calvo [mailto:[EMAIL PROTECTED]]
This issue has been discussed some time ago and Erik Hatcher
sent a patch proposing the definition of all properties in build.xml
and letting users customize their environment (javacc.home,
etc.) in build.properties. IMO, this is the best
It would be good to also know the average size of your documents, the size
of your index, and the amount of RAM required for each benchmark.
Lucene currently indexes using very little memory. You're making it faster
by using more RAM. In particular you're able to get a 10% speedup (58
versus
From: Che Dong [mailto:[EMAIL PROTECTED]]
Here is an example of sorting results by score multiplied by a rank field:
scorer.score(new HitCollector() {
public final void collect(int doc, float score) {
[ ... ]
String rank = reader.doc(doc).getField("rank").stringValue();
The problem is that
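The collect-and-reweight pattern in that snippet can be written out as a self-contained sketch, independent of Lucene's actual HitCollector and stored-field APIs. The Hit class and the rank lookup array here are illustrative stand-ins only.

```java
import java.util.ArrayList;
import java.util.List;

// Lucene-free sketch of rank-weighted collection: each hit's score is
// multiplied by a per-document rank value before ranking. Hit and
// rankByDoc are stand-ins for Lucene's HitCollector callback and a
// stored "rank" field.
public class RankWeightedCollector {
  public static class Hit {
    public final int doc;
    public final float weighted;
    public Hit(int doc, float weighted) { this.doc = doc; this.weighted = weighted; }
  }

  private final float[] rankByDoc;             // stand-in for the stored field
  private final List<Hit> hits = new ArrayList<>();

  public RankWeightedCollector(float[] rankByDoc) {
    this.rankByDoc = rankByDoc;
  }

  // Called once per matching document, like HitCollector.collect().
  public void collect(int doc, float score) {
    hits.add(new Hit(doc, score * rankByDoc[doc]));
  }

  // Results ordered by the combined (score * rank) value, descending.
  public List<Hit> topHits() {
    hits.sort((a, b) -> Float.compare(b.weighted, a.weighted));
    return hits;
  }
}
```

Looking the rank up in an in-memory array, rather than loading the stored document inside collect(), keeps the per-hit cost small.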
FYI, I will be on vacation, without email access, starting tomorrow through
March 19th. Please don't expect any responses from me about Lucene during
this time.
Sorry for the SPAM.
Doug
[EMAIL PROTECTED] wrote:
Otis,
You can remove the .lock file and try re-indexing or
continuing
indexing where you left off.
I am not sure about the corrupt index. I have never seen it
happen,
and I believe I recall reading some messages from Doug Cutting
From: Peter Carlson [mailto:[EMAIL PROTECTED]]
I recently updated the contributions page (last night), but I
need Doug to update the site.
I just updated the site. We should get you the privileges required to do
this.
Once you have the privileges, all that you do is:
ssh www.apache.org
I've lost track of just where we are with the 1.2 release.
Are there outstanding bugs that we intend to fix before the 1.2 release?
There have been only a few minor patches since RC4. Should we make an RC5
or just go ahead with the final release?
Doug
Brian Goetz wrote:
I still want to see Date and Number fields supported as basic types in
the Field class, rather than use a String in this magic date format.
The first part of this is easy: just add new Field constructor methods
that take Date and number parameters, e.g.:
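The message is cut off before the example, so here is a hypothetical sketch of what such an overload might do: accept a Date directly and convert it internally to a fixed-width string whose lexicographic order matches chronological order. The class and method names are illustrative, not Lucene's actual API.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical Field-style helper taking a Date directly. The value is
// rendered as a fixed-width GMT timestamp string so that sorting the
// strings sorts the dates. Names are illustrative only.
public class DateFieldSketch {
  public static String keywordValue(Date date) {
    SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmss");
    fmt.setTimeZone(TimeZone.getTimeZone("GMT")); // fixed zone for stable order
    return fmt.format(date);
  }
}
```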
[EMAIL PROTECTED] wrote:
[ ... ]
+ private static final boolean DISABLE_LOCKS = Boolean.getBoolean("disableLocks");
[ ... ]
public boolean obtain() throws IOException {
- if (Constants.JAVA_1_1) return true;// locks disabled in jdk 1.1
+ if
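One detail of the Boolean.getBoolean call in this diff is worth noting: it reads a JVM system property (e.g. -DdisableLocks=true on the command line), not an environment variable, and any value other than "true" yields false. A minimal demonstration:

```java
// Boolean.getBoolean(name) returns true only when the *system property*
// of that name is set to "true" (case-insensitive). It never consults
// environment variables, and unrecognized values read as false.
public class DisableLocksDemo {
  public static void main(String[] args) {
    System.setProperty("disableLocks", "true");
    System.out.println(Boolean.getBoolean("disableLocks")); // true
    System.setProperty("disableLocks", "yes");
    System.out.println(Boolean.getBoolean("disableLocks")); // false
  }
}
```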
Jon Scott Stevens wrote:
Adding support to Lucene for Nilsimsa seems like a cool idea...
http://ixazon.dynip.com/~cmeclax/nilsimsa.html
The index would be the hash and one could use Lucene to rank searches based
on the Nilsimsa rating of the results...
Nilsimsa employs a very different
Halácsy Péter wrote:
Could you please make a proposal to the lucene-dev list of
which methods and
classes should be made public or protected or non-final, and
what documentation
should be added?
1. All package-protected abstract methods of Searcher should be made protected
abstract.
These
I just added a remote searchable implementation.
See src/test/org/apache/lucene/search/TestRemoteSearchable.java for an
example of how this can be used. This is the first RMI code I've
written, so please tell me if I've got something wrong.
Doug
--
To unsubscribe, e-mail: mailto:[EMAIL
[EMAIL PROTECTED] wrote:
Log:
msg.txt
Oops. That log entry was supposed to read:
Added support for boosting the score of documents and fields via the
new methods Document.setBoost(float) and Field.setBoost(float).
Note: This changes the encoding of an indexed value. Indexes
document scoring, so that a user
can alter any part of the formula without altering Lucene's core code.
Enjoy!
Doug
Original Message
Subject: Re: cvs commit:
jakarta-lucene/src/test/org/apache/lucene/search TestDocBoost.java
Date: Mon, 29 Jul 2002 12:14:22 -0700
From: Doug Cutting
Mike Tinnes wrote:
I've been working on tying in a PageRank algo to
my web crawler using lucene and have a few problems. If I don't know the
boost factor until AFTER the crawl is it possible to still set the boost?
Why not: (1) crawl, saving pages to disk; (2) analyze links and compute
Karl von Randow wrote:
The org.apache.lucene.search.BooleanClause is not currently Serializable; I
would like to propose that it be made Serializable.
You're right, it should be. This is a bug. When I recently added
support for remote searching I tested only TermQuery.
I fixed this and
Christian Ullenboom wrote:
I took a look at the StopFilter/StopAnalyzer, the BitVector, and
PorterStemmer and I would like to optimize the code. What is the best
way to contribute?
Please submit contributions to [EMAIL PROTECTED] If the
changes are small, a diff file is appropriate. For
This looks great to me.
Does anyone object to adding this to Lucene as the package
org.apache.lucene.analysis.ru?
Doug
Dmitry Ovsyanko wrote:
http://www.halyava.ru/do/org.apache.lucene.analysis.zip
This looks great! If I understand correctly, it can be used to quickly
build stemmers for lots of languages. For example, the following page
lists the location of ispell dictionaries for over 30 languages!
Clemens Marschner wrote:
I need to perform an AND query on two fields and weight the results
according to in which fields the results came from. That is, I would need
something like
(field1^2 OR field2^1):(+token1 +token2 +token3)
This means that _all_ of the tokens _have_ to occur in
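One way to read the pseudo-syntax above is as shorthand for expanding the query per field. A string-level sketch of that expansion (the output uses standard Lucene query syntax; the helper itself is hypothetical):

```java
// Expands a weighted multi-field query like the pseudo-syntax
//   (field1^2 OR field2^1):(+token1 +token2)
// into explicit per-field clauses:
//   (+field1:token1 +field1:token2)^2 (+field2:token1 +field2:token2)^1
// so each field's clause requires all tokens, with a per-field boost.
public class FieldExpansion {
  public static String expand(String[] fields, int[] boosts, String[] tokens) {
    StringBuilder out = new StringBuilder();
    for (int f = 0; f < fields.length; f++) {
      if (f > 0) out.append(' ');
      out.append('(');
      for (int t = 0; t < tokens.length; t++) {
        if (t > 0) out.append(' ');
        out.append('+').append(fields[f]).append(':').append(tokens[t]);
      }
      out.append(")^").append(boosts[f]);
    }
    return out.toString();
  }
}
```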
+1
Che Dong wrote:
Attached is StandardTokenizer.jj with Sigram-based East Asian
language support,
tested under Windows and GNU/Linux.
It just treats different UnicodeBlocks with different word
segmentation methods.
Hope in a future release we can add more language
support in StandardTokenizer.jj step by
Did my suggestion not make sense?
I think we can make everyone happy here. By adding a parameter to the
existing query parser we can:
1. Keep things so that the default behaviour is not to permit initial
wildcards.
2. Make it so that developers who want to permit initial wildcards
can
Can you please submit a complete, self-contained test program that
demonstrates the problem? That will make it much easier for someone to
debug and fix it.
Thanks,
Doug
Rasik Pandey wrote:
Hello,
I am getting the following exception when searching using a
MultiSearcher and the first
This fixes the query parser, but, unfortunately, the problem is deeper.
BooleanQuery does not implement boosting. This could be fixed too,
but, for now, the easiest thing to do is simply to boost each term
within the boolean query.
Doug
Che Dong wrote:
1. Custom sorting besides the default score sorting: make docID an alias for the
field you need the output sorted by.
Solved by sorting the data before indexing (for example, sorted by the field
PostDate), so docID can be an alias for the sort field. If we make the HitCollector
sort by docID or 1/docID or even
Clemens Marschner wrote:
I want to perform some rewriting rules on the queries I get. The best way to
do that is to edit the parse tree.
However, the Query classes do not contain any methods for reading out or
altering their contents or to clone them.
Is there any reason for that? Or is
Lee Mallabone wrote:
Should I update the patch for now so that BooleanQuery.setBoost() just
calls setBoost() on all its clauses?
That only works if you call setBoost() after all of the clauses have
been added, which is a little fragile. So you'd also need to boost new
clauses as they're
Otis,
I really appreciate all of the work you do on Lucene. However, sometimes
I have to disagree.
[EMAIL PROTECTED] wrote:
- Added FIXME/TODO tags about things to document.
While documentation in a package private class is nice, it is not an
absolute requirement. So I don't think this
Otis Gospodnetic wrote:
Sorry about that, I'll put the old file back.
Regarding javadocs - I simply wanted a way to see the Javadocs of some
classes (FileInfos, I believe it was) that were not visible.
Maybe we should add another target: javadocs-internal or something.
That would be good
Otis Gospodnetic wrote:
--- Doug Cutting [EMAIL PROTECTED] wrote:
Maybe we should add another target: javadocs-internal or something.
That would be good encouragement to add javadoc comments to internal
classes.
Sounds good to me.
I think it would encourage documentation of internals
Doug Cutting wrote:
I've attached this in Open Office format and as HTML. The HTML
conversion is not great, but it's readable. Perhaps I should maintain
this in HTML instead of Open Office, since it contains no diagrams...
For some reason the HTML conversion was dropped in the copy I
Doug Cutting wrote:
For some reason the HTML conversion was dropped in the copy I received.
So here it is again.
Looks like this mailing list drops HTML attachments...
This time I zipped it. We'll see if that works.
Doug
FileFormats.zip
Description: Macintosh archive
Scott Ganyo wrote:
Nevertheless, I'm willing to accept that you have defined it as Lucene
standard style and I do abide by it when developing Lucene...
I don't think style should be (or even can be) mandated. When writing
new code from scratch, a developer should of course try to use a style
Rasik Pandey wrote:
Developers,
Attached is the diff for MultiSearcher which seems to correct these
bugs. I have not yet found any problems caused by these changes in
testing, but we will keep you informed!
[ ... ]
diff -w -r1.4 MultiSearcher.java
96,98c100,105
public final
Rasik Pandey wrote:
Understood. I made the second change, in MultiSearcher, and it works on
this end. Do you think this change needs to be made in other places of
the lucene code, such as the SegmentsReader.readerIndex(int n) method,
as it uses what looks to be the same algorithm?
I was
Otis Gospodnetic wrote:
Every URL extracted from a fetched document needs to be looked up in
this VisitedURLsFilter. If not there already, it needs to be added to
it (and to the queue of URLs to fetch). If there already, it is thrown
away.
Because of this, the data structure that
Brian Goetz wrote:
Let's say we search for "text retrieval". We want to find documents that
have "text retrieval" in the body OR in the keywords, but we want to
weight hits on the keywords more heavily. I can't boost the tokens in
the index base, so I have to do that through the query.
Tom Dunstan wrote:
I'd like some feedback on an idea that I have to extend lucene to hold the
extra information that it needs to stop me having to reparse the entire body
text again to generate excerpts.
Basically, to work out which sections of the text have the terms that
generate the
Stas Chetvertkov wrote:
Recently we needed to pass Query objects over the network. We
encountered a problem: BooleanQuery cannot be serialized even though the
abstract Query class is Serializable. The source of the problem is that
BooleanQuery holds a vector of BooleanClause objects
Dmitry Serebrennikov wrote:
I know that the FAQ says that they are, but in at least one instance in
my index it appears to be equal to 1.94something. Are the scores
guaranteed to be between 0 and 1
No.
and if not, what would it take to make
them such?
A different Similarity
documented.
Doug
Otis Gospodnetic wrote:
This sounds good to me, as it would lead us to pluggable similarity
computation....
I can refactor some of this tonight.
Otis
--- Doug Cutting [EMAIL PROTECTED] wrote:
This looks like a good approach. When I get a chance, I'd like to
make
Konrad Scherer wrote:
I am using lucene 1.2 (Java 1.4 on Solaris 7) and the xml indexer to
index ~24000 small xml documents. The finished and optimized index uses
around 340 MB disk space. The documents are reindexed once a week and
this has worked without any trouble for months. Recently the
Last week I checked in changes that provide a public API that lets
applications easily alter Lucene's scoring function. The API is
documented in the javadoc for the (now public) class
org.apache.lucene.search.Similarity.
Has anyone had a chance to try this?
Doug
Scott Ganyo wrote:
Now that we've committed to Java 2, I would not be opposed to removing
Enumeration references... or at least deprecating them in favor of
newer-style methods.
The javadoc for Enumeration says:
The functionality of this interface is duplicated by the Iterator
interface. In
Konrad Scherer wrote:
I have modified QueryParser.jj and PhrasePrefixQuery.java to allow
wildcard searches within phrases. This turned out to be a very involved
change going through a few revisions. I have tried to make the changes
as clean as possible.
Thanks for taking the time to work on
[ I moved this discussion to lucene-dev. -drc ]
This looks like a premature optimization gone bad.
Brian, you made this change. Would you like to fix it, or should I?
Doug
Chris D wrote:
I found that the current code in CVS prevents a
org.apache.lucene.search.DateFilter from functioning
Julien Nioche wrote:
This kind of modification could be done in almost all the methods of the
classes BooleanQuery and PhraseQuery, providing a small optimization (I did
not measure it, but even small optimizations can be useful).
These computations are performed only once per search. I would
Perhaps IndexWriter is badly named. It might better be called
IndexAppender. It doesn't normally touch any of the index but the list
of segments, unless it has to merge some segments, in which case it
usually only touches a small subset of the index data.
IndexReader, on the other hand, is
Otis Gospodnetic wrote:
We had this in Lucene Sandbox? I never saw it committed, weird.
I just committed it today. The commit message bounced because it was
too big.
I can't get it from the repository, any idea why?
Some protections were wrong. I think I fixed it. Try now.
Doug
--
Otis Gospodnetic wrote:
I wonder about SnowballAnalyzer and SnowballFilter
classes.
The constructor of the latter uses introspection to instantiate the appropriate
Stemmer.
In most use cases that will be the same Stemmer from call to call.
Seems like redundant work and objects created.
Wouldn't it be
Shah, Vineel wrote:
Here's what I'm trying to do:
A query that looks for "java unix windows" in the keywords field of an index.
If the document has "java unix", the score is .66..., regardless of any other factor. I want 1.0 for all three, .33... for just one, and no hit for none.
This is easy
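What is being asked for is essentially a pure coordination score: the fraction of query terms that match, with every other factor ignored. A Lucene-free sketch of that computation (in Lucene itself this behavior would come from customizing the similarity, but the helper below is illustrative only):

```java
import java.util.Set;

// Pure coordination scoring: score = matching terms / total query terms,
// so 3 of 3 => 1.0, 2 of 3 => .66..., 1 of 3 => .33..., 0 of 3 => 0
// (no hit). Illustrative stand-in, not Lucene's actual scoring code.
public class CoordOnlyScore {
  public static float score(Set<String> docTerms, String[] queryTerms) {
    int matches = 0;
    for (String t : queryTerms) {
      if (docTerms.contains(t)) matches++;
    }
    return (float) matches / queryTerms.length;
  }
}
```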
Jonathan Baxter wrote:
How important is it for I/O performance that Lucene uses only one byte
to represent document length? Or are there reasons other than
performance for using so few bits?
To achieve good search performance, field-length normalization factors
must be memory-resident. So
Great article! I look forward to the rest of the series!
The Java Developers Journal also recently ran a cover story on Lucene.
Full text is not freely available, but the figures and examples are at:
http://www.sys-con.com/java/source.cfm?id=1777
Should we add a link to this article on the
I think you're proposing that the classes in
http://jakarta.apache.org/lucene/docs/lucene-sandbox/snowball/api/
be added to the core Lucene jar and release. Is that right?
I don't have a problem with this. Do others?
The Javadoc should probably also include a pointer to:
Sounds like a bug.
Can you please supply a complete, self-contained test case? Ideally as
a JUnit test class.
Thanks,
Doug
Rasik Pandey wrote:
Hello,
Can anyone explain why I would be seeing this when re-using a query
(MultiTermQuery or PrefixQuery, or any Query that doesn't implement the
sent previously.
Let me know if I should enter a bug report?
Thanks,
Rasik
-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 17, 2003 8:47 PM
To: Lucene Developers List
Subject: Re: java.lang.UnsupportedOperationException
Sounds like a bug.
Can you
Leo Galambos wrote:
When I want to search Linux, nothing is found.
This word is in every article in the content.
Or is something wrong?
Yes :)
why? log(1)=0. it is OK, I think :-))) so where's any problem?
Lucene's IDF computation is:
log(maxDoc / (docFreq + 1)) + 1.0
Thus a term which
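Plugging numbers in shows why a term that occurs in every document still matches: the +1 in the denominator and the added 1.0 keep the factor positive, so log(1)=0 never zeroes the score.

```java
// IDF per the formula quoted above: log(maxDoc / (docFreq + 1)) + 1.0.
// Even when a term occurs in every document (docFreq == maxDoc) the
// result stays just below 1.0 rather than collapsing to zero.
public class IdfDemo {
  public static double idf(int docFreq, int maxDoc) {
    return Math.log((double) maxDoc / (docFreq + 1)) + 1.0;
  }
  public static void main(String[] args) {
    System.out.println(idf(1000, 1000)); // a term in every doc: still positive
    System.out.println(idf(1, 1000));    // a rare term scores much higher
  }
}
```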
Lucene does not permit one to modify documents that are already indexed.
You must delete them and re-index them, even if changes are only to
non-indexed fields. Lucene should not be used as a document database.
It is a full-text indexing library, which, as a convenience, permits one
to store
I'm confused. The contract of this method is to return the top-scoring
nDocs. For a multi-searcher it must compute the top-scoring nDocs from
each sub-searcher, then find the top-scoring nDocs among these. If you
want more of the top-scoring documents, just pass in a larger value for
nDocs.
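The merge that this contract describes, taking the top nDocs from each sub-searcher and then the top nDocs among those, can be sketched with a bounded priority queue. This is plain Java, not the actual MultiSearcher code, and ScoredDoc is an illustrative stand-in for Lucene's ScoreDoc.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of merging per-searcher top-nDocs lists into one global
// top-nDocs list. ScoredDoc is a stand-in for Lucene's ScoreDoc.
public class TopDocsMerge {
  public static class ScoredDoc {
    public final int doc;
    public final float score;
    public ScoredDoc(int doc, float score) { this.doc = doc; this.score = score; }
  }

  public static List<ScoredDoc> merge(List<List<ScoredDoc>> perSearcher, int nDocs) {
    // Min-heap bounded at nDocs entries: it always holds the nDocs
    // highest-scoring hits seen so far, with the smallest on top.
    PriorityQueue<ScoredDoc> heap =
        new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
    for (List<ScoredDoc> hits : perSearcher) {
      for (ScoredDoc hit : hits) {
        heap.add(hit);
        if (heap.size() > nDocs) heap.poll(); // evict the current minimum
      }
    }
    List<ScoredDoc> top = new ArrayList<>(heap);
    top.sort((a, b) -> Float.compare(b.score, a.score)); // best first
    return top;
  }
}
```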
+1
I like this approach of modifying the query parser through subclassing.
We should consider taking this approach further, e.g., perhaps by making
addClause(), getFieldQuery() and getRangeQuery() into protected methods,
so that folks can modify their behavior too. Thoughts?
Also, I think we