Henri Yandell wrote:
Redirect of jakarta.apache.org/lucene to lucene.apache.org/java/docs/index.html
I noticed there's a commented out redirect in the .htaccess, so after
adding my own I deleted it again and left the redirect off for the
moment. Unsure if there's a reason the commented out bit is
Erik Hatcher wrote:
When Doug is cool with re-enabling the redirect, it's fine with me.
I'm cool with it if it works. Why not re-enable it, search for
site:apache.org lucene on Google, Yahoo! and MSN, and click on the
first few links. If these work, then I'm okay with the redirect.
As we
Henri Yandell wrote:
Your download page is already separate, you're using the global closer.cgi file.
So we need to:
- rename Lucene Java's mailing lists, with forwards put into place.
- add a mailing list page to Lucene Java's website, modelled after
Garrett Rooney wrote:
Actually, currently we've got both lucene4c and java commits going to
[EMAIL PROTECTED], and there was some talk of just leaving it
that way, since it isn't that much traffic and it encourages people to
keep an eye on what's going on in other languages.
I think that's a
Kevin A. Burton wrote:
Wolf Siberski wrote:
Kevin A. Burton wrote:
I see following issues with your patch:
- you changed the DEFAULT_... semantics from constant to modifiable,
but didn't adjust the names according to Java conventions
(default_...).
Java doesn't have any naming conventions
Kevin A. Burton wrote:
Doug Cutting wrote:
Wolf Siberski wrote:
So, if anything at all, I would rather opt for making these constants
private :-).
I agree. In general, fields should either be final, or private with
accessor methods. So, we could change this to:
private static int
Attached is a patch which delays reading of index terms until it is
first accessed. The cost of this is another file descriptor, until the
terms are accessed, when it is closed. The benefit is that operations
that do not require access to index terms are much faster and use much
less memory.
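The deferred-read idea described above can be sketched in a few lines. This is only an illustration in present-day Java, not the attached patch: the class name LazyTermIndex, the Supplier standing in for the term index file, and the loadCount counter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical holder that defers loading index terms until first use. */
final class LazyTermIndex {
    private final Supplier<List<String>> loader; // stands in for reading the term index file
    private List<String> terms;                  // stays null until the first access
    private int loadCount;                       // how many times the loader actually ran

    LazyTermIndex(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    /** Loads the terms on the first call, then returns the cached copy. */
    synchronized List<String> terms() {
        if (terms == null) {
            terms = loader.get();
            loadCount++;
        }
        return terms;
    }

    synchronized int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LazyTermIndex index = new LazyTermIndex(() -> {
            List<String> loaded = new ArrayList<>();
            loaded.add("apache");
            loaded.add("lucene");
            return loaded;
        });
        System.out.println(index.loadCount()); // nothing has been read yet
        System.out.println(index.terms());     // first access triggers the load
        System.out.println(index.loadCount());
    }
}
```

Callers that never ask for the terms never pay for loading them, which is the trade-off the message describes.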
Kevin A. Burton wrote:
You know ... the javadoc on the site doesn't include non-public classes
like TermInfosWriter. Confused me for a second.
That's because it's not public. The javadoc on the site is to document
the public api. This is not a bug, but a feature.
Also.. the site doesn't
Kevin A. Burton wrote:
Also, I assume that the reason you make the reader field protected is
because getReader() is not sufficient, i.e., you want to set the
reader. This would stylistically be better done with a setReader()
method, no? Do you only change it at construction, or at runtime?
Wolf Siberski wrote:
The price is an extension (or modification) of the
Searchable interface. I've added corresponding search(Weight...) methods
to the existing search(Query...) methods and deprecated the latter.
I think this is the right solution.
If Searchable is meant to be Lucene internal,
Wolf Siberski wrote:
Now I found another solution which requires more changes, but IMHO is
much cleaner:
- when a query computes its Weight, it caches it in an attribute
- a query can be 'frozen'. A frozen query always returns the cached
Weight when calling Query.weight().
Originally there was no
Paul Elschot wrote:
Would you mind if some pieces of your reply end up in the
javadocs?
Not at all.
Doug
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
+1
Doug
+1
Doug
George Aroush wrote:
Any thoughts on Lucene.Net/dotLucene package name are welcome.
I agree that Lucene.Net is a better name. It's more consistent with
Lucene Java and Lucene4c, the names for other ports of Lucene. I think
it's okay to reclaim the name of an abandoned project, especially if
Daniel Naber wrote:
could someone (Doug?) make me an administrator for the old Lucene project
at sourceforge?
Done.
Doug
Henri Yandell wrote:
On names, Lucene Java might hit trademark issues I guess. So potential
worry there.
Good point. Although I note that Apache already has projects called
Xerces Java and Xalan Java. Sun says:
http://www.sun.com/policies/trademarks/#20c
So, technically, the fullname of the
Erik Hatcher wrote:
Doug - do you have your Forrest work handy? Or has anyone else stepped
up to build the web site?
I don't have anything reusable. I converted Nutch from a different (not
Anakia) XML-based site to Forrest with little difficulty (mostly using
string replace in Emacs).
I
Erik Hatcher wrote:
I have checked out our current site to the lucene.apache.org area, and
I've also set up a redirect from the jakarta.apache.org/lucene area.
Keep in mind, there are two projects here:
1. Porting Java Lucene's site to Forrest. This should be structured as
a sub-project of
Garrett Rooney wrote:
Additionally it would be good to work on updating the disk format
documentation, I've found several cases where the docs are quite out of
date compared to the current code. It's hard to expect the various
different ports to maintain compatibility when the formats are only
Garrett Rooney wrote:
Agreed. Java Lucene is a subproject of the Lucene TLP, leaving the
existing Java Lucene site there for the time being seems ok, just so we
have something there, but we should endeavour to put up something more
permanent ASAP.
I think, for the present,
Erik Hatcher wrote:
I'm really at the limit of my bandwidth - I've got the sandbox
restructuring effort on my plate right now and would like it if someone
could pick up the ball on the web site side of things.
Then perhaps you shouldn't have redirected everything to
lucene.apache.org...
We
Erik Hatcher wrote:
It also might be a good time to think about mailing list names. There
was a request on infrastructure@ to move [EMAIL PROTECTED] to
[EMAIL PROTECTED], would it make more sense to move it to [EMAIL PROTECTED]
NOW you tell me :)
I think until we have these elusive other
Doug Cutting wrote:
And we also want to try not to break URLs when we move things. For this
reason it's best to move things as few times as possible, so that we
don't end up with a confusing set of redirects.
More to the point, we also want to try not to break email addresses. So
the fewer
Bernhard Messer wrote:
Doug, you placed a copy of the website in the java directory. In both,
the original and the java directory the api directory is missing. I
can't copy it in because of the access rights :-(
Argh. The group protection is 'lucene', as it should be, but you're not
in
Erik Hatcher wrote:
I've amended my request for e-mail lists here with Doug's preference:
http://issues.apache.org/jira/browse/INFRA-195
Do others agree this is the best approach? I don't mean to be
autocratic. Do we imagine different pools of users and developers for
different Lucene
Oscar Picasso wrote:
Hi,
I am currently implementing a Directory backed by a Berkeley DB that I am
willing to release as an open source project.
Besides the internal implementation, it differs from the one in the sandbox in
that it is implemented with the Berkeley DB Java Edition.
Using the Java
[ Please ignore my previous message. I somehow hit Send before typing
anything! ]
Oscar Picasso wrote:
However with a relatively high number of random insertions, the cost of the
new IndexWriter / index.close() performed for each insertion is too high.
Did you measure that? How much slower was
Paul Elschot wrote:
I learned a lot by adding some javadocs to such classes. I suppose Doug
added the Expert markings, but I don't know their precise purpose.
The Expert declaration is meant to indicate that most users should not
need to understand the feature. Lucene's API seeks to be both
Erik Hatcher wrote:
Also, we should package a lucene-XX-all.zip/.tar.gz that includes all
the contrib pieces also allowing someone to simply download Lucene and
all the packaged contrib pieces at once.
I'll go further: that should be the only download. We should avoid
having a bunch of
Erik Hatcher wrote:
Hmmm good point. I hadn't considered access control. A migration
will be performed later today, and I think it will initially be a test
migration for me to verify. I'll double-check with Justin, who's doing
the conversion, on how access control will be initially
Erik Hatcher wrote:
The decision was a bit slow to get out, but Lucene has been approved for
TLP.
Thanks for pushing this through!
I
propose we simply import our two CVS repositories in with all of
jakarta-lucene as the root of the repository and jakarta-lucene-sandbox
under sandbox in the
Erik Hatcher wrote:
On Feb 1, 2005, at 3:13 PM, Doug Cutting wrote:
I think we want Java Lucene to be a sub-project of Lucene. So the
repository should be something like:
https://svn.apache.org/repos/asf/lucene/java
I already put in the request for this initial svn structure:
/asf/lucene
Doug Cutting wrote:
It would translate a query t1 t2 given fields f1 and f2 into
something like:
+(f1:t1^b1 f2:t1^b2)
+(f2:t1^b1 f2:t2^b2)
Oops. The first term on that line should be f1:t2, not f2:t1:
+(f1:t2^b1 f2:t2^b2)
f1:t1 t2~s1^b3
f2:t1 t2~s2^b4
Doug
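As a rough sketch of the corrected expansion, the following builds the clause strings for any number of terms and fields. The class and method names are hypothetical, boosts and slops are passed as symbolic strings so the output mirrors the notation above, and the phrase clauses are written with explicit quotes on the assumption that the archive dropped them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that spells out the multi-field expansion as query strings. */
final class MultiFieldExpansion {

    static List<String> expand(String[] terms, String[] fields, String[] termBoosts,
                               String[] slops, String[] phraseBoosts) {
        List<String> clauses = new ArrayList<>();

        // One required clause per term: the term must match in at least one field.
        for (String term : terms) {
            StringBuilder sb = new StringBuilder("+(");
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    sb.append(' ');
                }
                sb.append(fields[i]).append(':').append(term).append('^').append(termBoosts[i]);
            }
            clauses.add(sb.append(')').toString());
        }

        // One optional sloppy-phrase clause per field, rewarding proximity of all terms.
        String phrase = String.join(" ", terms);
        for (int i = 0; i < fields.length; i++) {
            clauses.add(fields[i] + ":\"" + phrase + "\"~" + slops[i] + "^" + phraseBoosts[i]);
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<String> clauses = expand(
                new String[] {"t1", "t2"}, new String[] {"f1", "f2"},
                new String[] {"b1", "b2"}, new String[] {"s1", "s2"},
                new String[] {"b3", "b4"});
        for (String clause : clauses) {
            System.out.println(clause);
        }
    }
}
```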
Chuck Williams wrote:
That expansion is scalable, but it only accounts for proximity of all
query terms together. E.g., it does not favor a match where t1 and t2
are close together while t3 is distant over a match where all 3 terms
are distant. Worse, it would not favor a match with t1 and t2 in
Christoph Goller wrote:
The similarity specified for the search has to be modified so that both
idf(...) AND queryNorm(...) always return 1 and as you say everything
except for tf(term,doc)*docNorm(doc) could be precompiled into the boosts
of the rewritten query. coord/tf/sloppyFreq computation
Maybe we should just call it lucene.apache.org, and move the current
Lucene project to lucene.apache.org/java? The other projects we imagine
adding (Nutch, DotLucene, CLucene, etc.) are all Lucene-related, no?
Lucene has a pretty good brand name...
Doug
Otis Gospodnetic wrote:
ir.apache.org
Wolf Siberski wrote:
Doug Cutting wrote:
So, when a query is executed on a MultiSearcher of RemoteSearchables,
the following remote calls are made:
1. RemoteSearchable.rewrite(Query) is called
After that step, are wildcards replaced by term lists?
Yes.
I haven't taken a look at the rewrite
Chuck Williams wrote:
Doug Cutting wrote:
It would indeed be nice to be able to short-circuit rewriting for
queries where it is a no-op. Do you have a proposal for how this
could
be done?
First, this gets into the other part of Bug 31841. I don't believe
MultiSearcher.rewrite() is ever
Erik Hatcher wrote:
The questions still remain, though, and lawyers do want to know the
answers:
- How did JDK code get into Lucene's codebase to begin with?
I put it there in a moment of ignorance way back as a hack in order to
make things run in an older version of the JVM.
Chuck Williams wrote:
I was thinking of the aggressive version with an index-time solution,
although I don't know the Lucene architecture for distributed indexing
and searching well enough to formulate the idea precisely.
Conceptually, I'd like each server that owns a slice of the index in a
Chuck Williams wrote:
There needs to be a way to create the aggregate docFreq table and keep
it current under incremental changes to the indices on the various
remote nodes.
I think you're getting ahead of yourself. Searchers are based on
IndexReaders, and hence docFreqs don't change until a new
Wolf Siberski wrote:
Chuck Williams wrote:
This is a nice solution! By having MultiSearcher create the Weight, it
can pass itself in as the searcher, thereby allowing the correct
docFreq() method to be called. This is similar to what I tried to do
with topmostSearcher, but a much better way to
Sigh. This stuff would get a lot simpler if we were able to use Java
1.4's FileLock. Then locks would be automatically cleared by the OS if
the JVM crashes.
Should we upgrade the JVM requirements to 1.4 for Lucene's 1.9/2.0
releases and update the locking code?
Doug
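For reference, a minimal sketch of taking an OS-level lock with java.nio.channels.FileLock follows. It is not Lucene's locking code: the class name, the helper methods and the temp-file name are hypothetical, and it uses try-with-resources, which needs a newer JVM than the 1.4 discussed here.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

/** Hypothetical sketch of an OS-backed lock file. */
final class OsLockSketch {

    /** Tries to take an exclusive lock on lockFile; returns true if it was held and then released. */
    static boolean lockAndRelease(File lockFile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(lockFile, "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.tryLock(); // null when another process holds the lock
            if (lock == null) {
                return false;
            }
            try {
                // The work protected by the lock would go here.
                return lock.isValid();
            } finally {
                lock.release();
            }
        }
    }

    /** Runs the sketch against a throwaway temp file. */
    static boolean demoOnTempFile() {
        try {
            File lockFile = File.createTempFile("write", ".lock");
            try {
                return lockAndRelease(lockFile);
            } finally {
                lockFile.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoOnTempFile());
    }
}
```

Because the operating system owns the lock, it goes away when the process holding it exits, which is the property the message is after.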
Luke Shannon wrote:
Here
Terry Steichen wrote:
Would it be
possible to optimize the operation to use 1.4 runtime features but
retain the option, if desired to run in a legacy (1.3) environment,
perhaps in a degraded mode?
Lucene 1.4.3 is a degraded mode, no?
There are still back-compatibility issues. To be safe,
Chuck Williams wrote:
As Wolf does, I hope a committer with deep knowledge of Lucene's design
in this area will weigh in on the issue and help to resolve it.
The root of the bug is in MultiSearcher.search(). This should construct
a Weight, weight the query, then score the now-weighted query.
Chuck Williams wrote:
This is a nice solution! By having MultiSearcher create the Weight, it
can pass itself in as the searcher, thereby allowing the correct
docFreq() method to be called.
Glad to hear it at least makes sense... Now I hope it works!
I'm still left wondering if having
markharw00d wrote:
If we intend to make more use of filters this may be an appropriate time
to raise a general question I have on their use. Is there a danger in
tying them to a specific implementation (java.util.BitSet)?
I do not object in principle to replacing BitSet with an interface,
Bernhard Messer wrote:
Why not implement a small utility class, e.g. CompoundFileUtil.java,
within the org.apache.lucene.index package? This class could be public
and implement the necessary functionality. This is what I would prefer,
because we don't have to change the visibility of
Filters are more efficient than query terms for many things. For
example, a RangeFilter is usually more efficient than a RangeQuery and
has no risk of triggering BooleanQuery.TooManyClauses. And Filter
caching (e.g., with CachingWrapperFilter) can make otherwise expensive
clauses almost free,
Chuck Williams wrote:
Finally, I'd suggest picking content that has multiple fields and allow
the individual implementations to decide how to search these fields --
just title and body would be enough. I would like to use my
MaxDisjunctionQuery and see how it compares to other approaches (e.g.,
Garrett Rooney wrote:
The least effort way of doing that would be to include both the core
and sandbox under the same trunk, but again, that implies that you
ALWAYS tag and branch them together, and sometimes you may not want to
do that.
I think we should always branch these together. To my
Chuck Williams wrote:
Another issue will likely be the tf() and idf() computations. I have a
similar desired relevance ranking and was not getting what I wanted due
to the idf() term dominating the score. [ ... ]
Chuck has made a series of criticisms of the DefaultSimilarity
implementation.
Dan Climan wrote:
Shouldn't the call to Similarity.decodeNorm be replaced with a call to
Similarity.getDefault().decodeNorm
decodeNorm is a static method.
Doug
Murray Altheim wrote:
I thought I'd have a go at the
Lucene logo, not to change it markedly but clean it up so that it
is based on an existing font. This potential Lucene logo is based
on an ITC font called Magneto Bold Extended, which you can see here:
http://www.identifont.com/show?72W
I
Daniel Naber wrote:
I'm aware that the Wildcard name won't
fit well anymore, suggestions for a better name are welcome.
Expanded?
Doug
Christoph Goller wrote:
I think we should change BooleanScorer. An easy way would be to sort the
bucket
list before it is used. Do you think that would affect performance
dramatically?
I think it would make it slower.
Otherwise we should reimplement BooleanScorer. I haven't looked into the
Christoph Goller wrote:
Doug, could you please move api/ to api.old/ and api.new/ to api/
Done.
Doug
Christoph Goller wrote:
I think I should finally make Release 1.4.3.
Great!
I presume the default.properties no longer exists. I just fill in
1.4.3 as version in the build.xml before building it. Is this ok?
I build releases with something like:
ant -Dversion=1.4.3 clean dist
So that it
Guillermo Payet wrote:
The fact that Lucene stores and indexes (or so it seems) all terms
as Strings and that there is no NumericTerm makes me think that I
might be missing something and that this might be a much bigger deal
than I think?
You could write a HitCollector that uses
Erik Hatcher wrote:
On Oct 20, 2004, at 12:14 PM, Doug Cutting wrote:
The advantages of a zero-character prefix default are that it's
back-compatible and that it will find more matches, when spelling
differences are in the first characters.
I prefer this default.
Anyone using QueryParser needs
Chuck Williams wrote:
However, I'm not sure this analysis is completely correct due to MultiSearcher.docFreq() which appears to be trying to redefine the tf's to be the global value across all indices. It wasn't clear to me how this code is ever reached, e.g. from TermQuery -- SegmentTermDocs.
Dan Climan wrote:
TermEnum terms = ir.terms();
int numTerms = 0;
while (terms.next())
{
    Term t = terms.term();
    if (t.field().equals(FullText))
        numTerms++;
}
Jonathan Hager wrote:
Nate Denning encountered the following error when trying to load a
large (greater than 2147483647 bytes) index into a RAMDirectory. The
server has 12GB of memory, so loading it into memory should not be a
problem.
Have you instead tried copying the index to a ramfs ('mount
Bernhard Messer wrote:
Christoph Goller wrote:
Bernhard Messer wrote:
Currently there are 3 different methods available to get the field
names from an index.
a) getFieldNames();
b) getFieldNames(boolean indexed);
c) getIndexedFieldNames(boolean storedTermVector);
my proposal is to deprecate a),
Chuck Williams wrote:
That's a good point on how the standard vector space inner product
similarity measure does imply that the idf is squared relative to the
document tf. Even having been aware of this formula for a long time,
this particular implication never occurred to me. Do you know if
+1
Christoph Goller wrote:
I would like to propose Bernhard as Lucene committer.
Daniel Naber wrote:
On Tuesday 12 October 2004 17:22, Doug Cutting wrote:
Which is worse: a person who searches for Photokopie~ in a 1000 document
collection does not find documents containing Fotokopie; or a person who
searches for Photokopie~ in a 1M document collection doesn't find
anything
Daniel Naber wrote:
Searching for Photokopie~ on a 230,000 document corpus takes 2.3 seconds here
(AMD Athlon 2600+; other fuzzy terms get similar performance). As the number
of terms doesn't increase so fast with more documents, it will not take 10
seconds for 1 million documents. So fuzzy
Christoph Goller wrote:
With the current scorer API one could get rid of BucketTable and
advance all subscorers only by one document each time. I am not sure
whether the BucketTable implementation is really much more efficient.
I only see the advantage of inlining some of the scorer.next and
Paul Elschot wrote:
I have a DisjunctionScorer based on a PriorityQueue lying around,
but I can't benchmark it myself at the moment. In case there is
interest, I'll gladly adapt it to org.apache.lucene.search and
add it in bugzilla.
This should look a lot like SpanOrQuery.getSpans().
On a related
Paul Elschot wrote:
Did you see my IDF question at the bottom of the original note? I'm
really curious why the square of IDF is used for Term and Phrase
queries, rather than just IDF. It seems like it might be a bug?
I missed that.
It has been discussed recently, but I don't remember the
Chuck Williams wrote:
I think there are at least two bugs here:
1. idf should not be squared.
I discussed this in a separate message. It's not a bug.
2. explain() should explain the actual reported score().
This is mostly a documentation bug in Hits. The normalization of scores
to 1.0 is
Chuck Williams wrote:
The issue is this. Imagine you have two fields, title and document,
both of which you want to search with simple queries like: albino
elephant. There are two general approaches, either a) create a combined
field that concatenates the two individual fields, or b) expand the
Andi Vajda wrote:
This code is generated by JavaCC. I think the best way to fix this
would be to fixup the code automatically whenever it is regenerated.
So, instead of patching QueryParser.java, patch build.xml. In the
javacc-QueryParser task, add a replace task which replaces
'jj_la1_0()'
Chuck Williams wrote:
That approach does not work. I could not find an approach that would
work with the built-in classes, although of course there might be one.
The problem has two components: coord and the fact that BooleanQuery's
sum their clause scores to compute the final score. The latter
Daniel Naber wrote:
The web page is updated now, could you please re-check if it's correct? I
added that information so that the Lucene <= 1.4 format is still there.
We should note that when compression is enabled, gzip is used.
Also, byte[] is not a type defined in the file. In the formalism
Daniel Naber wrote:
-It is the only change so far that we cannot express in the API, i.e. we
cannot just deprecate a method to make Lucene's users aware of this. So we
can only list it in CHANGES.txt, where some people will surely miss it.
We could define a new query parser class with the new
Christoph Goller wrote:
Since 1.4.2 is already out, we would have to make a version 1.4.3.
OK, one more vote needed :-)
I'm okay with a 1.4.3 release for this.
Doug
Daniel Naber wrote:
I copied PhrasePrefixQuery to MultiPhraseQuery, deprecating
PhrasePrefixQuery. The wiki also suggests making MultipleTermPositions a
private nested class. However, it is public currently so I wonder whether
we can just remove/deprecate it without offering an alternative.
Daniel Naber wrote:
I agree that the default should stay 0, even for Lucene 2.0.
It should certainly stay zero for 1.4.x releases.
However 2.0 is our opportunity to make incompatible changes. What is
the best default for this, that will work well for the most applications?
Does anyone have
[EMAIL PROTECTED] wrote:
goller 2004/10/11 06:36:14
Modified:src/java/org/apache/lucene/queryParser Tag: lucene_1_4_2_dev
QueryParser.java QueryParser.jj
[ ... ]
+ * @deprecated use {@link #getFieldQuery(String, String)}
Should these be deprecated
Erik Hatcher wrote:
It would be nice if the Sandbox components were versioned and released
along with the core - perhaps this would be a sufficient enough
solution? But, alas, I have no free time currently to devote to this
effort.
That's precisely the reason to add these to the main CVS
Otis Gospodnetic wrote:
I like this idea. I don't care so much about 1 or more CVS
repositories, as much as separate Jars, so if we can make
analyzers-1.4.2.jar and highlighter-1.4.2.jar along lucene-1.4.2.jar,
that would be ideal, in my opinion.
A minor point: we should prefix all the jar file
Andi Vajda wrote:
Do you intend to ultimately support Java Lucene with GCJ ?
As far as possible...
I'm down to 3 patches:
Can you please file a Lucene bug report and attach these patches? I'm
not guaranteeing that they'll all be committed right away, but rather
that that's a better place to
I just copied the 1.4.2 jar there.
Doug
Otis Gospodnetic wrote:
Here is the email I mentioned earlier on lucene-dev.
--- Brian McCallister [EMAIL PROTECTED] wrote:
To: [EMAIL PROTECTED]
From: Brian McCallister [EMAIL PROTECTED]
Subject: Maven Repo
Date: Thu, 26 Aug 2004 19:59:50 -0400
Hi all,
Daniel Naber wrote:
On Friday 01 October 2004 23:57, Doug Cutting wrote:
It is not mirrored yet. Erik's the only one who has ever done that.
Erik, do you have time to mirror 1.4.2? Thanks.
BTW, the release on the official download pages is still 1.4-final:
http://jakarta.apache.org/site
Christoph Goller wrote:
I would never
have guessed that calling the constructor there could make such a
difference.
The improvement is greatest for OR queries that contain a common term,
i.e., which match a large portion of the collection. However for, e.g.,
most phrase searches and AND
Christoph Goller wrote:
Items 4 and 5 don't seem that important to me. As far as I am
concerned we can leave them out.
When did 4 happen? Was it a rare or common problem?
I agree that we don't need to put 5 in 1.4.2.
So the only thing missing is your
optimization. Then 1.4.2 should be ready.
I
Paul Elschot wrote:
I'm working on a memory mapped directory that uses multiple buffers
for large files.
Great!
There will be a small performance hit, as each call to readByte() will
need to first check whether it's overflowed the current buffer, right?
While trying some test runs I found that
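The per-call check mentioned above can be pictured with a small sketch. This is not the memory mapped directory under discussion: MultiBufferInput is a hypothetical class that reads across several plain ByteBuffers, which here stand in for mapped regions of one large file.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

/** Hypothetical reader that presents several buffers as one continuous stream of bytes. */
final class MultiBufferInput {
    private final ByteBuffer[] buffers;
    private int current; // index of the buffer currently being read

    MultiBufferInput(ByteBuffer[] buffers) {
        this.buffers = buffers;
    }

    byte readByte() {
        // The extra cost per call: check whether the current buffer is exhausted
        // and, if so, move on to the next one.
        while (current < buffers.length && !buffers[current].hasRemaining()) {
            current++;
        }
        if (current == buffers.length) {
            throw new BufferUnderflowException();
        }
        return buffers[current].get();
    }

    public static void main(String[] args) {
        MultiBufferInput input = new MultiBufferInput(new ByteBuffer[] {
                ByteBuffer.wrap(new byte[] {1, 2}),
                ByteBuffer.wrap(new byte[] {3})
        });
        for (int i = 0; i < 3; i++) {
            System.out.println(input.readByte());
        }
    }
}
```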
The new release is up at http://jakarta.apache.org/lucene/.
It is not mirrored yet. Erik's the only one who has ever done that.
Erik, do you have time to mirror 1.4.2? Thanks.
Doug
Andi Vajda wrote:
You ask if this makes sense. No, not really. I don't know the details of
the purpose of the compound file implementation so this may be my problem.
The purpose of the compound file implementation is to minimize the
number of open files that an IndexReader must keep open.
Christoph Goller wrote:
I'd like the changes on FuzzyQuery, PhraseQuery, and PhrasePrefixQuery
included in the branch. Any objections?
I'm okay with these, but the primary purpose of 1.4.2 should be to
stabilize things, not to add new features. So let's be very selective
about what we add, and
Andi Vajda wrote:
So, my question: why is the compound file storage implemented in a way
that is orthogonal to Directory, instead of just being another Directory
implementation called FSCompoundFileDirectory?
To combine the files of a segment we need to know when the segment was
complete. So a
Daniel Naber wrote:
On Monday 20 September 2004 18:49, Doug Cutting wrote:
To be clear, you are proposing that we branch from the 1.4.1 tag in CVS
and re-apply the patches below?
Yes, exactly.
Now that we have a patch for the memory leak problem, should we start a
1.4.2 branch?
Doug
Daniel Naber wrote:
I can try to do some of the work, but I'd need detailed instructions for
branching and tagging. It's probably easier/better if you do those parts.
I've never branched with CVS before either... so here goes!
I've added a branch called lucene_1_4_2_dev. To get a copy, use:
cvs
Doug Cutting wrote:
Still to do:
1. Replace OutputStream with IndexOutput and BufferedIndexOutput. This
is not critical and mostly for consistency, as mmap makes more sense for
read-only data.
2. Update RAMDirectory and FSDirectory to no longer use deprecated
classes. This is done last
[EMAIL PROTECTED] wrote:
Added: src/java/org/apache/lucene/store MMapDirectory.java
Log:
Add an nio mmap based Directory implementation.
For my simple benchmarks this is somewhat slower than the classic
FSDirectory, but I thought it was still worth having. It should use
less memory
Bruce Ritchie wrote:
[EMAIL PROTECTED] wrote:
One downside
is that it cannot handle indexes with files larger than 2^31 bytes.
Can you expand slightly on what causes this limitation and whether it still exists on 64 bit hardware?
This is a limit of the nio ByteBuffer API, which uses int instead
Daniel Naber wrote:
I'm using gcc/gcj 3.3.3, do I maybe need a more recent version?
I'm currently using 3.4.1, but I think 3.4.0 will work as well. I had
troubles with 3.3.
I've worked more on this, and now have a version (not yet committed)
which appears a bit faster than a JVM. More soon.