Hi,
That was part of a project. While I was working on it, there were
plans to make it publicly available. I don't know what happened after I
left.
Well, in the simplest sense there are three steps to follow:
-Remove double/float numbers, replace them by functions that would be
performed
The TermFreqVector.getTermFrequencies always returns the same value as
TermDocs.freq, even if a field was set not to be added to the term
frequency vector.
Is this really the way it should be? It makes the fields even more
confusing than my prior post on the subject.
doc.add(new Field("foo", "b
I created an index with more than 30,000 text files.
I used indexExists() to determine whether to create a new index or to add
docs to the existing index.
But when the number of docs in the index was over 3,000 (sometimes 3,400,
sometimes 3,200), the indexExists function returned false, so I ended up
This sometimes happens when the number of docs is over 2,000, so it's kind
of random.
Wenjie
On 5/8/06, wenjie zheng <[EMAIL PROTECTED]> wrote:
I created an index with more than 30,000 text files.
I used indexExists() to determine whether to create a new index or to add
docs to the existing index.
Hi,
Where can I get the latest jar with org.apache.lucene.store.db.DbDirectory
class?
thanks.
Olivier.
It is in DocumentWriter.java class.
Look at writePostings(...) method.
Here are the lines:
  // add an entry to the freq file
  int f = posting.freq;
  if (f == 1)               // optimize freq=1
    freq.writeVInt(1);      // set low bit of doc num.
  else {
    freq.writeVIn
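To make the "set low bit of doc num" comment concrete, here is a minimal sketch of the general idea, not the actual DocumentWriter code. As the Lucene file-format documentation describes it, each .frq entry stores the document delta shifted left by one bit, and an odd value means the frequency is 1, so no separate frequency is written. The class and method names below are hypothetical, and plain ints in a list stand in for the VInts Lucene really writes.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mimics the low-bit trick, not Lucene's real writer. */
final class FrqEntrySketch {

    /** Appends one (docDelta, freq) entry as plain ints (Lucene writes VInts). */
    static void encode(int docDelta, int freq, List<Integer> out) {
        int docCode = docDelta << 1;   // keep the low bit free as a flag
        if (freq == 1) {
            out.add(docCode | 1);      // low bit set: freq is 1, nothing else written
        } else {
            out.add(docCode);          // low bit clear: the frequency follows
            out.add(freq);
        }
    }

    /** Decodes the entry starting at pos; returns {docDelta, freq, nextPos}. */
    static int[] decode(List<Integer> in, int pos) {
        int docCode = in.get(pos++);
        int docDelta = docCode >>> 1;
        int freq = (docCode & 1) != 0 ? 1 : in.get(pos++);
        return new int[] { docDelta, freq, pos };
    }

    public static void main(String[] args) {
        List<Integer> out = new ArrayList<>();
        encode(0, 1, out);   // doc delta 0, term occurs once
        encode(0, 3, out);   // doc delta 0, term occurs three times
        System.out.println(out);   // prints [1, 0, 3]
    }
}
```

With a doc delta of 0, which is the case when a single document is being written, a frequency of 1 collapses to the single value 1. That matches the `freq.writeVInt(1)` call in the quoted lines.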
Looking at your email again.
You are confusing the initial writing of postings with the segment merging.
Once the doc number is written, the .frq file is not changed. The segment
merge process will write to a new .frq file.
Make sense?
Jian
On 5/8/06, jian chen <[EMAIL PROTECTED]> wrote:
It
[
http://issues.apache.org/jira/browse/LUCENE-510?page=comments#action_12378519 ]
Marvin Humphrey commented on LUCENE-510:
The following patch...
* Changes Lucene to use bytecounts as the prefix to all written Strings
* Changes Lucene to write s
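For readers wondering why a byte-count prefix differs from a character-count prefix, here is a small sketch. It is not taken from the patch. It assumes standard UTF-8 via the JDK, which may differ from the encoding Lucene used at the time, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: compares the two candidate length prefixes for a String. */
final class StringLengthSketch {

    /** Length in Java chars (UTF-16 code units). */
    static int charCount(String s) {
        return s.length();
    }

    /** Length in bytes once the string is encoded as standard UTF-8. */
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String accented = "h\u00e9llo";   // e-acute takes two bytes in UTF-8
        System.out.println(charCount(ascii) + " chars, " + utf8ByteCount(ascii) + " bytes");
        System.out.println(charCount(accented) + " chars, " + utf8ByteCount(accented) + " bytes");
    }
}
```

For pure ASCII the two counts agree, so the difference only shows up with non-ASCII text. A reader given a byte count can skip over a string without decoding it, whereas a char count requires decoding every character to find the end.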
[ http://issues.apache.org/jira/browse/LUCENE-510?page=all ]
Marvin Humphrey updated LUCENE-510:
---
Attachment: strings.diff
> IndexOutput.writeString() should write length in bytes
> --
>
> Ke
--- "Marvin Humphrey (JIRA)" <[EMAIL PROTECTED]> wrote:
...
> It also slows Lucene down -- indexing takes around a
> 20% speed hit. It would be possible to submit a
> patch which had a smaller impact on performance, but
> this one is already over 700 lines long, and its
> goal is to achieve stan
Today, applications have to open/close an IndexWriter and open/close an
IndexReader directly or indirectly (via IndexModifier) in order to handle a
mix of inserts and deletes. This performs well when inserts and deletes
come in fairly large batches. However, the performance can degrade
dramaticall
On Sun, 2006-05-07 at 12:00 +0200, karl wettin wrote:
>
> doc.add(new Field("foo", "bar", Store.NO, Index.TOKENIZED,
> TermVector.YES));
> doc.add(new Field("foo", "bar", Store.NO, Index.TOKENIZED,
> TermVector.NO));
>
> Vector frequency of [foo, bar] is 2. I would expect it to be 1 or a
> field
This sounds very promising. Can you please attach it to a bug in Jira?
Thanks!
Doug
I will create a bug in Jira.
Let me try to attach the two files here again.
(See attached file: IndexWriter.changed)(See attached file:
TestWriterDelete.changed)
Regards,
Ning
Ning Li
Search Technologies
IBM Almaden Research Center
650 Harry Road
San Jose, CA 95120
Supporting deleteDocuments in IndexWriter (Code and Performance Results
Provided)
-
Key: LUCENE-565
URL: http://issues.apache.org/jira/browse/LUCENE-565
Project: Lucene - Java
Type: Bug
[ http://issues.apache.org/jira/browse/LUCENE-565?page=all ]
Ning Li updated LUCENE-565:
---
Attachment: IndexWriter.java
TestWriterDelete.java
> Supporting deleteDocuments in IndexWriter (Code and Performance Results
> Provided)
> --
Hi,
Not sure if people caught my question over on java-user@ about the possibility
of eliminating floating point calculations from Lucene's scoring. Before I
embark on this, I thought I'd ask:
- Am I crazy? Is this at all doable?
- Is this doable without forking and maintaining my own patches
[
http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12378557 ]
Doug Cutting commented on LUCENE-565:
-
Can you please attach diffs rather than complete files? The diffs should
not contain CHANGE comments. To generate diffs, check
Doug Cutting (JIRA) wrote:
Can you please attach diffs rather than complete files? The diffs should not
contain CHANGE comments. To generate diffs, check Lucene out of Subversion, make
your changes, then, from the Lucene trunk, run something like 'svn diff >
my.patch'. New files should
: One of the reasons I am looking at this is because I often need just
: yes/no (matches/doesn't match) answers, and don't care for the score.
I didn't realize that was an option -- i thought you wanted integer
scoring, and the best advice i had for that was to search and replace.
But if you jus