Ajay Upadhyaya wrote:
We have a large number of documents indexed (approx 1M); the
size of the index is approx 1G. We have a few Keyword fields as well
as a few UnStored fields.
Now there is a requirement to change the names of a few fields. The
application code can be easily changed to create the
Andrzej Bialecki wrote:
Doug Cutting wrote:
You cannot easily change the field names in the index.
..through the existing API, that is. Because you can change the content
of the *.fnm file appropriately, right?
Right. One could write something that would re-write all of the .fnm
files. It
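A rough sketch of what such a .fnm rewriter might look like, assuming the 1.x FieldInfos layout (a VInt field count followed by a name/bits pair per field) and a hypothetical rename map; check the file-formats document for your index version before relying on it:
  import java.util.Map;
  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.IndexInput;
  import org.apache.lucene.store.IndexOutput;

  // Sketch only: rewrites field names from one .fnm file into a copy.
  // The <VInt count, then String name + Byte bits per field> layout is an
  // assumption taken from the 1.x file-formats documentation.
  public static void renameFields(Directory dir, String fnmName, Map renames)
      throws java.io.IOException {
    IndexInput in = dir.openInput(fnmName);
    IndexOutput out = dir.createOutput(fnmName + ".renamed");
    try {
      int count = in.readVInt();
      out.writeVInt(count);
      for (int i = 0; i < count; i++) {
        String name = in.readString();
        byte bits = in.readByte();
        String newName = (String) renames.get(name);   // null means unchanged
        out.writeString(newName != null ? newName : name);
        out.writeByte(bits);
      }
    } finally {
      in.close();
      out.close();
    }
    // After verifying the copy, replace the original .fnm with it.
  }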
Erik Hatcher wrote:
Ultimately, though, the decision to refactor the codebase to use
interfaces more pervasively lies with Doug.
Actually the decision lies not with me, but with the Lucene PMC as a
group, according to Apache's voting process:
http://www.apache.org/foundation/voting.html
But, lik
Erik Hatcher wrote:
I just tried regenerating, which automatically pulls from CVS, and got
this error:
/Users/erik/dev/lucene/java/contrib/snowball/snowball/website/p/generator.c:425: internal compiler error: in extract_insn, at recog.c:2175
[apply] Please submit a full bug report,
[a
Erik Hatcher wrote:
If you see regeneration differences would you please commit them?
There were no differences.
Doug
Guillermo Payet wrote:
In any case... the point being that we want to just have one
IndexSearcher for the whole App.
But... we're starting to run out of file handles on our server,
and an lsof returns lots and lots of these:
java 22755 tomcat 320r REG 9,3 2992899 1177377 /var/ix/_2a8.cfs (deleted
Erik Hatcher wrote:
I think something like this would make a handy addition to our contrib
area at least.
Perhaps.
What use cases cannot be met by regular expression matching?
Doug
Wolfgang Hoschek wrote:
The classic fuzzy fulltext search and similarity matching that Lucene is
good for :-)
So you need a score that can be compared to other matches? This will be
based on nothing but term frequency, which a regex can compute. With a
single document there'll be no IDFs, so y
Bernhard Messer wrote:
I'm not a fan of outdated software or historical systems. So I think the
best would be to keep Lucene backward compatible with version 1.9
and perform the switch to JDK 1.4 with Lucene 2.0.
That sounds like a good plan.
Which raises the question, when should we make t
[EMAIL PROTECTED] wrote:
http://issues.apache.org/bugzilla/show_bug.cgi?id=31841
[EMAIL PROTECTED] changed:
What|Removed |Added
Status|NEW |RESOLVED
Wolf Siberski wrote:
In each case applications should call a corresponding Searcher method.
Here I don't agree completely and have another suggestion to resolve that
issue. The affected methods are low-level API methods anyway,
and even before their javadoc referred application developers to othe
Erik Hatcher wrote:
There are two .java files attached that may not make it through to the
list. These are simple wrappers that do exactly what you'd expect. The
idea is to make dealing with Lucene Hits more "Java like" with an
Iterator, which in turn makes this much more amenable to Groovy.
+
Erik Hatcher wrote:
I fixed it. TermInfosTest is, however, not a real JUnit test case, so I
wonder how useful it is at all...
I'm curious - did your fix change the code to go against a new API?
Yes, but not a public API.
In other words, is there something that has changed that breaks API compati
Erik Hatcher wrote:
I'm not quite sure
where to put MemoryIndex - maybe it deserves to stand on its own in a
new contrib area?
That sounds good to me.
Or does it make sense to put this into misc (still
in sandbox/misc)? Or where?
Isn't the goal for sandbox/ to go away, replaced with contrib/
Please find attached something I wrote today. It has not yet been
tested extensively, and the documentation could be improved, but I
thought it would be good to get comments sooner rather than later.
Would folks find this useful?
Should it go into the core or in contrib?
Doug
Index: src/java/or
Thanks for doing all this! It looks great!
Erik Hatcher wrote:
However it seems much simpler for us to only distribute
lucene-XX.tar.gz/zip and lucene-XX-src.tar.gz/.zip rather than
distributing each contrib component separately.
I agree.
The current build
process builds the same 4 distributio
Monsur Hossain wrote:
George, what about SharpZipLib:
http://www.icsharpcode.net/OpenSource/SharpZipLib/Default.aspx
It's a third-party project, but it's written in C# and is under the GPL.
GPL unfortunately means that the library cannot be distributed by Apache
with Lucene.Net.
Doug
I'd prefer if the list of file extensions was in a single place, and
that place should be somewhere in the index package, not in the store
package.
Doug
Erik Hatcher wrote:
My rationale for keeping all the contrib components in their own
subdirectories was to allow room for eventual documentation or other
files that might want to come along for the ride (like maybe a
dependent ASL'd JAR?).
That makes sense.
I'd be happy to change it if that i
[EMAIL PROTECTED] wrote:
don't declare Exceptions that are never thrown; remove an unused variable
When these are implementing a public interface or abstract method I think
it is good to keep the exception declaration, as it is a part of the
interface. That way, if an exception needs to be thrown
There's a post over at SearchEngineWatch theorizing about how Google
produces summaries.
http://forums.searchenginewatch.com/showthread.php?threadid=5448
Lucene's current highlighter doesn't easily support multi-fields, nor
does it take phrasal matching into account. It might be useful to have
Attached is a patch that makes it possible to supply a user-specified
parser to FieldCache. For example, one might use this to process a date
field as ints even if it was not indexed as a decimal integer.
Comments?
Doug
Index: src/java/org/apache/lucene/search/FieldCache.java
=
ument.
With that principle in mind I should really make sure that if I search for:
("Doug Cutting" AND lucene) OR google
I shouldn't highlight "Doug Cutting" in a matching document that has
google but not lucene.
Shouldn't the search code already take care of tha
tch as a new Podling.
The Nutch proposal vote is at:
http://www.mail-archive.com/general@incubator.apache.org/msg04201.html
The Nutch proposal is at:
http://wiki.apache.org/incubator/NutchProposal
The Nutch mentors are:
Doug Cutting
Erik Hatcher
If you accept this podling, please add Erik and
Background: In http://issues.apache.org/bugzilla/show_bug.cgi?id=34673,
Yonik Seeley proposes a ConstantScoreQuery, based on a Filter. And in
http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg08007.html
I proposed a mechanism to promote the use of Filters. Through all of
this, Paul
Yonik Seeley wrote:
Could you elaborate on the advantage of having say a TermQuery that
could be either normal-scoring or constant-scoring vs two different
Query classes for doing this? They seem roughly equivalent.
You could code it that way too. It would require exposing TermWeight
and TermSco
Attached is a patch to build.xml and common-build.xml that makes 'ant
test' succeed. The problem is that, classically, unit tests in Lucene
are named Test*.java, but there are tests in contrib named *Test.java,
and there are non-unit tests in src/test named *Test.java. Until this
is resolved,
Doug Cutting wrote:
> Would folks find this useful?
Since the general feedback was positive, I committed this.
Chuck Williams wrote:
Yes, very useful, especially if you added one additional feature that
looks straightforward from the code below. That is a facility to append
the stored fie
[EMAIL PROTECTED] wrote:
controversial: do not fail the build for contrib components not building successfully. this is to make Gump happy for now, but in the future a more granular conditional build of each contrib project may be desirable
+1
The contrib stuff doesn't have the same guarantees as
Robert Engels wrote:
I have always thought that the norms should be an interface, rather than
fixed, as there are many uses of lucene where norms are not necessary, and
the memory overhead is substantial.
I agree, but that's not the whole story.
If one seeks merely to avoid caching the norms i
Robert Engels wrote:
Attached are files that dramatically improve the searching performance
(2x improvement on several hardware configurations!) in a multithreaded,
high concurrency environment.
This looks like some good stuff! Can you perhaps break it down into
independent, layered patches?
Robert Engels wrote:
Ok. Attached are the updated files. I also forgot some of the changed files
the first time around (CompoundFileReader also had synchronization that
needed to be removed).
Again, it would be much easier to understand if you supplied patches,
i.e., diffs, so that we can focu
Robert Engels wrote:
2. I agree with creating NioFSDirectory rather than modifying FSDirectory. I
originally felt that memory-mapped files would be the fastest, but that also
requires OS calls; the "caching" code is CONSIDERABLY faster, since it does
not need to do any JNI or make OS calls.
On th
Arvind Srinivasan wrote:
Some options are:
(1) Commit the counter after the newSegmentName call. This way we never reuse
the segment name.
(2) Add a callback API to the Directory interface for new segment creation,
allowing the directory implementation to clean up on a new segment write.
(3) Provi
Doug Cutting wrote:
I've attached a patch. Does this fix things for you?
Oops. That had a bug.
Here's a revised patch. It now passes all unit tests.
Doug
Index: src/java/org/apache/lucene/store/FSDirectory.java
=
Arvind Srinivasan wrote:
The patch on the follow up mail does look good. However, I have additional
concerns:
(a) deleteFile call may fail, e.g. the file is left open from a previous exception.
This makes me believe the ideal scenario is not to reuse the segment name
once the newSegment call iss
Doug Cutting wrote:
Attached is a patch that makes it possible to supply a user-specified
parser to FieldCache. For example, one might use this to process a date
field as ints even if it was not indexed as a decimal integer.
As there were no objections, I have committed this patch.
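For reference, a hedged usage sketch of the new hook (field name, date format, and the packing scheme are made up for illustration; 'reader' is an IndexReader in scope):
  // Parse a date field indexed as "yyyy-MM-dd" strings into packed ints.
  FieldCache.IntParser dateParser = new FieldCache.IntParser() {
    public int parseInt(String value) {
      int year  = Integer.parseInt(value.substring(0, 4));
      int month = Integer.parseInt(value.substring(5, 7));
      int day   = Integer.parseInt(value.substring(8, 10));
      return year * 10000 + month * 100 + day;   // e.g. "2005-06-03" -> 20050603
    }
  };
  int[] dates = FieldCache.DEFAULT.getInts(reader, "date", dateParser);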
Doug
Doug Cutting wrote:
I think the fix is much simpler. This is a bug in FSDirectory.
Directory.createOutput() should always create a new empty file, and
FSDirectory's implementation does not ensure this. It should try to
delete the file before opening it and/or call
RandomAccessFile.setL
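A minimal sketch of the idea (not the actual patch): inside FSDirectory.createOutput, remove or truncate any leftover file so the output always starts empty ('directory' is FSDirectory's underlying File; java.io imports assumed):
  File file = new File(directory, name);
  if (file.exists() && !file.delete()) {
    // could not delete (perhaps still open elsewhere) -- truncate it instead
    RandomAccessFile raf = new RandomAccessFile(file, "rw");
    raf.setLength(0);
    raf.close();
  }
  // ... then open the usual file-backed IndexOutput on 'file'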
Daniel Naber wrote:
What do you think? If this gets accepted, it also needs a better name.
It looks reasonable to me.
As for names, IndexWriter would be a good one for this, and
IndexAppender would be a better name for what's now called IndexWriter.
Unfortunately, I don't see a way to make
Daniel Naber wrote:
can someone please check my changes to fileformats.xml regarding the
compound format? (not yet on the website, call "ant" in the "site"
directory to build the files locally).
Looks good.
One improvement: You could define FileData more formally as something like:
FileData
Daniel Naber wrote:
On Friday 03 June 2005 19:02, Doug Cutting wrote:
FileLength[i] -> ((i == FileCount) ? EOF : DataOffset[i+1]) - DataOffset[i]
Not sure if that really helps. At least I find it confusing, as neither the
"?" operator nor the "EOF" occurs anywhe
[EMAIL PROTECTED] wrote:
--- lucene/java/trunk/src/java/org/apache/lucene/store/FSDirectory.java
(original)
+++ lucene/java/trunk/src/java/org/apache/lucene/store/FSDirectory.java Mon Jun
6 10:52:12 2005
@@ -52,8 +52,8 @@
if (name.endsWith("."+IndexReader.FILENAME_EXTENSIONS[i]))
Bernhard Messer wrote:
Therefore I would like to propose two changes:
1) we should store the extensions in a hash and not in a String[] to have a
faster lookup
Do you mean to use something like:
int lastDot = name.lastIndexOf('.');
if (lastDot >= 0) {
String nameExt = name.substring(lastDot
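The rest of that check might look something like the following, with the known extensions kept in a HashSet as Bernhard suggests (java.util imports assumed; the extension list here is just a stand-in for whatever String[] constant ends up holding them):
  private static final String[] INDEX_EXTENSIONS = { "cfs", "fnm", "fdx", "fdt" /* ... */ };
  private static final Set EXTENSION_SET = new HashSet(Arrays.asList(INDEX_EXTENSIONS));

  static boolean isIndexFile(String name) {
    int lastDot = name.lastIndexOf('.');
    if (lastDot < 0)
      return false;
    String nameExt = name.substring(lastDot + 1);   // extension without the dot
    return EXTENSION_SET.contains(nameExt);
  }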
Bernhard Messer wrote:
sorry for the confusion. At first look, I thought the new class
IndexFileNames, containing the necessary constant values, fits perfectly
into org.apache.lucene.index. After a more detailed look, I get the
feeling that it would be much better to place the new class into
Bernhard Messer wrote:
I finished and committed the changes. There are two new
classes in package org.apache.lucene.index.
org.apache.lucene.index.IndexFileNames contains common Lucene-related
filenames and extensions; the scope of the class itself and its members
is package. org.
+1
Doug
I started a Lucene status page for July at:
http://wiki.apache.org/jakarta-lucene/Report-2005-07
Please help populate this page. It should contain news related to the
Lucene top-level project (not just the Java sub-project) since Lucene
became a top-level project at the beginning of this year
Paul Smith wrote:
I know there's a mapreduce branch in the nutch project, but is there
any plan/talk of perhaps integrating something like that directly into
the Lucene API? For projects that need a lower-level API like Lucene,
rather than the crawl-like nature of Nutch, the potential to i
Doug Cutting wrote:
Perhaps we need to factor Nutch into two projects, one with NDFS and
MapReduce and the other with the search-specific code. This falls
almost exactly on package lines. The packages
org.apache.nutch.{io,ipc,fs,ndfs,mapred} are not dependent on the rest
of Nutch.
FYI
Ken Krugler wrote:
The remaining issue is dealing with old-format indexes.
I think that revving the version number on the segments file would be a
good start. This file must be read before any others. Its current
version is -1 and would become -2. (All positive values are version 0,
for b
[EMAIL PROTECTED] wrote:
How will the difference impact String memory allocations? Looking at
the String code, I can't see where it would make an impact.
I spoke a bit too soon. I should have looked at the code first. You're
right, I don't think it would require more allocations.
When con
Yonik Seeley wrote:
I've been looking around... do you have a pointer to the source where just
the suffix is converted from UTF-8?
I understand the index format, but I'm not sure I understand the problem
that would be posed by the prefix length being a byte count.
TermBuffer.java:66
Things
Yonik Seeley wrote:
A related problem exists even if the prefix length vInt is changed to
represent the number of unicode chars (as opposed to number of java chars),
right? The prefix length is no longer the offset into the char[] to put the
suffix.
Yes, I suppose this is a problem too. Sigh
Yonik Seeley wrote:
Where/how is the Lucene ordering of terms used?
An ordering is necessary to be able to find things in the index.
For the most part, the ordering doesn't seem to matter... the only query that
comes to mind where it does matter is RangeQuery.
For back-compatibility it would be
Wolfgang Hoschek wrote:
I don't know if it matters for Lucene usage. But if using
CharsetEncoder/CharBuffer/ByteBuffer should turn out to be a
significant problem, it's probably due to startup/init time of these
methods for individually converting many small strings, not inherently
due to
I don't in general disagree with this sort of optimization, but I think
a good fix is a bit more complicated than what you posted.
Lukas Zapletal wrote:
And here come the fixes:
OutputStream:
/**
 * Writes an array of bytes.
 *
 * @param b
 *          the bytes
Paul Elschot wrote:
I suppose one of these cases is when many terms are used in a query.
Would it be easily possible to make the buffer size for a term iterator
depend on the number of documents to be iterated?
Many terms only occur in a few documents, so this could be a
nice win on total buf
Paul Elschot wrote:
I tried delaying the buffer allocation in BufferedIndexInput by
using this clone() method:
public Object clone() {
  BufferedIndexInput clone = (BufferedIndexInput)super.clone();
  clone.buffer = null;
  clone.bufferLength = 0;
  clone.bufferPosition = 0;
  clone.
Erik Hatcher wrote:
I'm using the trunk of Subversion (pretty much what 1.9 will be) on all
my projects and it is quite stable. I defer to the others on when we
release it as 1.9 officially, though.
I think the 1.9 release should be made soon. What is required is a
motivated committer wit
Erik Hatcher wrote:
I haven't seen this come across the java-dev list (I could have missed
it though). Everyone ok with moving to JIRA?
+1
Scott Ganyo wrote:
What is required to make the release?
The (somewhat dated) steps are at:
http://wiki.apache.org/jakarta-lucene/ReleaseTodo
Probably the first thing to do is to update these (cvs -> svn) and see
if folks suggest any other improvements.
We should start with a 1.9-rc1 relea
Chris Hostetter wrote:
2) Can you think of a clean way for individual applications to eliminate
norms (via subclassing the lucene code base - ie: no patching)
Can't you simply subclass FilterIndexReader and override norms() to
return a cached dummy array of Similarity.encodeNorm(1.0f) f
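Something along these lines, as a sketch of that subclassing approach (the norms(String, byte[], int) variant may need the same treatment):
  import java.util.Arrays;
  import org.apache.lucene.index.FilterIndexReader;
  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.search.Similarity;

  // Hand back a shared dummy norms array instead of reading the norms files.
  public class NoNormsReader extends FilterIndexReader {
    private byte[] ones;   // lazily built, shared for every field

    public NoNormsReader(IndexReader in) {
      super(in);
    }

    public synchronized byte[] norms(String field) {
      if (ones == null) {
        ones = new byte[maxDoc()];
        Arrays.fill(ones, Similarity.encodeNorm(1.0f));
      }
      return ones;
    }
  }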
Robert Engels wrote:
Doesn't this cause a problem for highly interactive and large indexes, since
every update to the index requires rewriting the norms and constructing a
new array?
The original complaint was primarily about search-time memory size, not
update speed. I like the propo
Marvin Humphrey wrote:
What are the advantages of the With class? Why not just obtain the
lock, run a block, and release the lock?
The release should be in a 'finally' block. 'With' enforces that.
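For illustration, the shape of that pattern (loosely modeled on the Lock.With idea, not the exact class):
  import java.io.IOException;
  import org.apache.lucene.store.Lock;

  // Obtain the lock, run the body, and guarantee release in a finally block.
  public abstract class With {
    private Lock lock;
    private long timeout;

    public With(Lock lock, long timeout) {
      this.lock = lock;
      this.timeout = timeout;
    }

    /** Code to run while the lock is held. */
    protected abstract Object doBody() throws IOException;

    public Object run() throws IOException {
      boolean locked = false;
      try {
        locked = lock.obtain(timeout);
        return doBody();
      } finally {
        if (locked)
          lock.release();
      }
    }
  }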
Doug
Last week I proposed to the Lucene PMC that we make Yonik Seeley a
committer on Lucene Java. I am pleased to announce that other PMC
members agreed. Welcome, Yonik!
Doug
Chris Hostetter wrote:
2) On the subject of committing and becoming a committer: I've noticed a few
questions recently about why/when patches can/will be committed; and
Yonik's new status has me wondering about how people become committers, and
what guidelines exist for committers to know how/when to
Erik Hatcher wrote:
As for accepting patches - with Lucene I'm personally very conservative
with applying patches.
There are good reasons to be conservative. When a committer commits a
patch he or she vouches for the quality of that patch. Any problems
that ensue are, to some degree, the re
Robert Engels wrote:
The reason for using NIO and not IO is that IO requires multiple file handles
per file. There are already numerous bugs/work-arounds in Lucene to limit the
use of file handles (as this is an OS-limited resource), so I did not wish to
further increase the number of file descripto
Grant Ingersoll wrote:
Should I get the source and propose a patch or is there somebody who is
in "charge" of the website?
A patch would be great. The site is generated from the xdocs directory
with 'ant docs'. You also need to check out jakarta-site2 as
../jakarta-site2.
Doug
Marvin Humphrey wrote:
I think it's time to throw in the towel.
Please don't give up. I think you're quite close.
I would be careful using CharBuffer instead of char[] unless you're sure
all methods you call are very efficient. You could try avoiding
CharBuffer by adding something (ugly) l
Another approach might be to, instead of converting from UTF-8 to strings
right away, change things to convert lazily, if at all. During index
merging such conversion should never be needed. You needn't do this
systematically throughout Lucene, but only where it makes a big
difference. For exa
[EMAIL PROTECTED] wrote:
+23. Added regular expression queries, RegexQuery and SpanRegexQuery.
+Note the same term enumeration caveats apply with these queries as
+apply to WildcardQuery and other term expanding queries.
+(Erik Hatcher)
I don't like adding more error-prone stuff lik
Erik Hatcher wrote:
The downside is scoring closer matches (in say the WildcardQuery) would
no longer be possible, right?
Right. We could implement a scorer that keeps a byte array of scores
instead of a bit vector, using Similarity.java's 8-bit float format.
That would use more memory, but
Paul Elschot wrote:
I think losing the field boosts for PrefixQuery and friends would not be
advisable. Field boosts have a very big range and from that a very big
influence on the score and the order of the results in Hits.
It should not be hard to add these. If a field name is provided, the
Yonik Seeley wrote:
As far as API goes, I guess there should be a constructor
ConstantScoreQuery(Filter filter, String field)
If field is non-null, then the field norm can be multiplied into the score.
You could implement this with a scorer subclass that multiplies by the
norm, removing a condi
Paul Elschot wrote:
Not using the document term frequencies in PrefixQuery would still
leave these as a surprise factor between PrefixQuery and TermQuery.
Should we dynamically decide to switch to FieldNormQuery when
BooleanQuery.maxClauseCount is exceeded? That way queries that
currently wo
Yonik Seeley wrote:
Scoring recap... I think I've seen 4 different types of scoring
mentioned in this thread for a term expanding query on a single field:
1) query_boost
2) query_boost * (field_boost * lengthNorm)
3) query_boost * (field_boost * lengthNorm) * tf(t in q)
4) query_boost * (field_b
Yonik Seeley wrote:
Totally untested, but here is a hack at what the scorer might look
like when the number of terms is large.
Looks plausible to me.
You could instead use a byte[maxDoc] and encode/decode floats as you
store and read them, to use a lot less RAM.
// could also use a bitse
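A hedged sketch of that byte[] variant (the expanded 'terms' array and 'reader' are assumed to be in scope):
  // Accumulate a rough per-document score in one byte per document,
  // using Similarity's 8-bit float encoding to save memory.
  byte[] scores = new byte[reader.maxDoc()];
  TermDocs td = reader.termDocs();
  for (int i = 0; i < terms.length; i++) {
    td.seek(terms[i]);
    while (td.next()) {
      int doc = td.doc();
      float current = Similarity.decodeNorm(scores[doc]);
      scores[doc] = Similarity.encodeNorm(current + 1.0f);  // crude tf accumulation
    }
  }
  td.close();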
Yonik Seeley wrote:
Hmmm, very interesting idea.
Less than one decimal digit of precision might be hard to swallow when
you have to add scores together though:
smallfloat(score1) + smallfloat(score2) + smallfloat(score3)
Do you think that the 5/3 exponent/mantissa split is right for this,
or wo
In general I would not take this sort of profiler output too literally.
If floatToRawIntBits is 5x faster, then you'd expect a 16% improvement
from using it, but my guess is you'll see far less. Still, it's
probably worth switching & measuring as it might be significant.
Doug
Paul Smith wro
Yonik Seeley wrote:
I'm not sure I understand why this is. epsilon is based on 1,
(smallest number such that 1-epsilon != 1, right?). What's special
about 1?
1 is special for multiplication, but, you're right, not so special for
addition, the operation in question. The thing that makes addi
Yonik Seeley wrote:
mantissa_bits=4, zeroExp=4:
1) 0.0021972656
2) 0.0024414062
70) 0.875
71) 0.9375
72) 1.0
73) 1.125
74) 1.25
75) 1.375
76) 1.5
254) 7340032.0
255) 7864320.0
This would be a good choice. I think the following is also a contender:
mantissa_bits=5, zeroExp=2:
1) 0.033203125
2)
Yonik Seeley wrote:
Hmmm, is .03->2000 really enough range?
Seems like the choice is between that and .0005->200 with one less
mantissa bit.
Consider the failure modes:
With the .0005->200 range we'll fail to distinguish close-scoring
matches in more common score ranges, while more c
Yonik Seeley wrote:
Do you think that underflow should map to the smallest representable
number (like norm encoding does) or 0?
The smallest representable, I think.
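For reference, the kind of encode routine being discussed might look like this, with the mantissa/exponent split and the underflow behavior as above (parameter names are illustrative):
  // 8-bit float: keep the top bits of the IEEE representation, re-biased so
  // 'zeroExp' picks the range; underflow maps to the smallest non-zero value.
  static byte floatToByte(float f, int numMantissaBits, int zeroExp) {
    int fzero = (63 - zeroExp) << numMantissaBits;     // our zero point, in small-float units
    int bits = Float.floatToRawIntBits(f);
    int smallfloat = bits >> (24 - numMantissaBits);   // sign, exponent, top mantissa bits
    if (smallfloat <= fzero)
      return (bits <= 0) ? (byte) 0 : (byte) 1;        // zero/negative -> 0, underflow -> smallest
    if (smallfloat >= fzero + 0x100)
      return (byte) 0xff;                              // overflow -> largest representable
    return (byte) (smallfloat - fzero);
  }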
Doug
Erik Hatcher wrote:
Would this be an acceptable change to commit?
The javadoc is pretty slim!
This adds another field to Field which is not stored and will hence not
be reflected in hit documents. So it will confuse folks.
Doug
Marvin Humphrey wrote:
As I work on the ports for FSDirectory and FSLock, I'm wondering why
the file system directory itself shouldn't be used to hold the
lockfiles. If there's a write permissions problem, such as an
FSDirectory on CDrom, disableLocks takes care of it. Is there some
scen
[EMAIL PROTECTED] wrote:
+ * Invoked, by DocumentWriter, before indexing a Field instance if
+ * terms have already been added to that field. This allows custom
+ * analyzers to place an automatic position increment gap between
+ * Field instances using the same field name. The default
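Assuming the hook being documented here is Analyzer.getPositionIncrementGap, a minimal use might be:
  import org.apache.lucene.analysis.standard.StandardAnalyzer;

  // Put a large position gap between successive values of the same field
  // so phrase and span queries cannot match across them.
  public class GappedAnalyzer extends StandardAnalyzer {
    public int getPositionIncrementGap(String fieldName) {
      return 1000;   // gap size is arbitrary here
    }
  }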
Yonik Seeley wrote:
a) do any other committers want a license, and
Why not just include all committer names?
b) would we be willing to put their logo somewhere in exchange?
Perhaps we should reserve that until we find that Lucene has been
significantly improved by YourKit.
Doug
Erik Hatcher wrote:
While there have been several different topics brought up on this
thread, it seems we're diverging from the original idea. Let's
consider the most basic use case example here, and I'm making it
intentionally as concrete as possible:
A Swing client performs searches by
Nicolas Belisle wrote:
Since Java Content Repository uses java.io.InputStream, I extended
RAMInputStream to achieve random reads from the java.io.InputStream.
(Have a better idea ?)
So you're buffering the entire file? That doesn't sound good. If there
are no provisions for random access, t
Amol Bhutada wrote:
If I have a reader and searcher on an indexdata folder and another
IndexWriter writing documents to the same indexdata folder, do I need to
close the existing reader and searcher and create new ones so that newly
indexed data becomes searchable?
[ moved from user to dev list]
Yes, that's a good start. Your patch does not handle deletions
correctly. If a segment has had deletions since it was opened then its
deletions file needs to be re-read. I also think returning a new
IndexReader is preferable to modifying one, since an IndexReader is
often used as a cache key
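A minimal sketch of that reopen pattern (the 'searcher' and 'searcherVersion' bookkeeping fields are assumed):
  // Swap in a fresh reader+searcher when the index version on disk changes.
  synchronized void maybeReopen(Directory dir) throws IOException {
    if (IndexReader.getCurrentVersion(dir) != searcherVersion) {
      IndexReader newReader = IndexReader.open(dir);
      IndexSearcher newSearcher = new IndexSearcher(newReader);
      IndexSearcher old = searcher;
      searcher = newSearcher;
      searcherVersion = newReader.getVersion();
      old.close();   // in practice, defer until in-flight searches complete
    }
  }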
I just setup nightly builds for Lucene on our new Solaris zone.
These are at:
http://cvs.apache.org/dist/lucene/java/nightly/
I've updated the header for the binary release page to note this:
http://www.apache.org/dist/jakarta/lucene/binaries/
(BTW, we should sometime move our releases out of
Daniel Naber wrote:
On Wednesday 25 January 2006 23:26, Doug Cutting wrote:
I just setup nightly builds for Lucene on our new Solaris zone.
These are at:
http://cvs.apache.org/dist/lucene/java/nightly/
Thanks! What about putting that on the front page as a news item?
+1
Doug
mark harwood wrote:
For these outlier situations is it worth adding a
"maxDf" property to TermQuery like BooleanQuery's
maxClause query-time control? I could fix my problem
in my own app-specific query construction code but I
wonder if others would find it a useful fix to add to
TermQuery in the
I'd like to push out a 1.9 release candidate in the next week or so.
Are there any patches folks are really hoping to sneak into 1.9? If so,
now's the time.
Doug
Doug Cutting wrote:
I'd like to push out a 1.9 release candidate in the next week or so. Are
there any patches folks are really hoping to sneak into 1.9? If so,
now's the time.
This is a great time to improve the javadoc. I see lots of blank boxes
which could use a bit of descri
Chris Hostetter wrote:
in the case where doc boosts and field boosts aren't used, it seems like
it would be very easy to write a maintenance app that did something
like...
get instance of similarity based on input
foreach fieldName in input {
int[] termCounts = new int[maxDoc];
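Filling in the rest of that pseudocode as a rough sketch (no document or field boosts, one field at a time; 'reader', 'fieldName', and 'similarity' are assumed to come from the input, as in the pseudocode above):
  // Count the terms per document for this field, then rewrite its norms.
  int maxDoc = reader.maxDoc();
  int[] termCounts = new int[maxDoc];
  TermEnum termEnum = reader.terms(new Term(fieldName, ""));
  TermDocs termDocs = reader.termDocs();
  try {
    do {
      Term term = termEnum.term();
      if (term == null || !term.field().equals(fieldName))
        break;                                     // past the last term of this field
      termDocs.seek(term);
      while (termDocs.next())
        termCounts[termDocs.doc()] += termDocs.freq();
    } while (termEnum.next());
  } finally {
    termEnum.close();
    termDocs.close();
  }
  for (int d = 0; d < maxDoc; d++)
    if (!reader.isDeleted(d))
      reader.setNorm(d, fieldName, similarity.lengthNorm(fieldName, termCounts[d]));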
Chris Hostetter wrote:
I'm not sure what the ASF/Lucene policy is on keeping Copyright/License
statements in source files up to date, but should they all be updated to
say "Copyright 2006 The Apache Software Foundation" prior to a 1.9
release?
It shouldn't hurt!
This week is pretty booked for
DM Smith wrote:
Would that mean that 1.9 and 2.0 will be released at the same time?
No. 2.0 will be released after 1.9. The primary change will be that
all deprecated methods are removed; there may be other changes, but
probably not many.
Doug