[
https://issues.apache.org/jira/browse/LUCENE-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678045#action_12678045
]
Ning Li commented on LUCENE-1541:
-
An index size comparison would be great.
> Tri
[
https://issues.apache.org/jira/browse/LUCENE-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675390#action_12675390
]
Ning Li commented on LUCENE-1541:
-
When one precision step is given, it is converte
[
https://issues.apache.org/jira/browse/LUCENE-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675212#action_12675212
]
Ning Li commented on LUCENE-1541:
-
If you are *really* concerned with the additional
Components: contrib/*
Reporter: Ning Li
Priority: Minor
In the current trie range implementation, a single precision step is specified.
With a large precision step (say 8), a value is indexed in fewer terms (8) but
the number of terms for a range can be large. With a small
[
https://issues.apache.org/jira/browse/LUCENE-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674248#action_12674248
]
Ning Li commented on LUCENE-1470:
-
Agree. Do you want to open a new issue? If you wan
[
https://issues.apache.org/jira/browse/LUCENE-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674051#action_12674051
]
Ning Li commented on LUCENE-1470:
-
Hi Uwe,
I had something similar in mind when I
[
https://issues.apache.org/jira/browse/LUCENE-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673912#action_12673912
]
Ning Li commented on LUCENE-1470:
-
Good stuff!
Is it worth to also have an optio
LUCENE-1335 is not listed in CHANGES.txt? It also includes a minor
behavior change: "no longer allow the same Directory to be passed into
addIndexes* more than once".
Cheers,
Ning
On Thu, Sep 18, 2008 at 2:29 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I just created the first
>>> Even so,
>>> this may not be sufficient for some FS such as HDFS... Is it
>>> reasonable in this case to keep in memory everything including
>>> stored fields and term vectors?
>>
>> We could maybe do something like a proxy IndexInput/IndexOutput that
>> would allow updating the read buffer fro
On Mon, Sep 8, 2008 at 4:23 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>> I thought an index reader which supports real-time search no longer
>> maintains a static view of an index?
>
> It seems advantageous to just make it really cheap to get a new view
> of the index (if you do it for every sear
On Mon, Sep 8, 2008 at 2:43 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> But, how would you maintain a static view of an index...?
>
> IndexReader r1 = indexWriter.getCurrentIndex()
> indexWriter.addDocument(...)
> IndexReader r2 = indexWriter.getCurrentIndex()
>
> I assume r1 will have a view of
Hi,
We experimented using HBase's scalable infrastructure to scale out Lucene:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg01143.html
There is a concern about the impact of HDFS's random read performance
on Lucene search performance. And we can discuss if HBase's architecture
is best for scal
[
https://issues.apache.org/jira/browse/LUCENE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628025#action_12628025
]
Ning Li commented on LUCENE-532:
Is the use of seek and write in ChecksumIndexOu
+1
On Thu, Aug 28, 2008 at 8:19 PM, Michael McCandless (JIRA)
<[EMAIL PROTECTED]> wrote:
>
>[
> https://issues.apache.org/jira/browse/LUCENE-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626805#action_12626805
> ]
>
> Michael McCandless commen
[
https://issues.apache.org/jira/browse/LUCENE-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626158#action_12626158
]
Ning Li commented on LUCENE-1335:
-
Maybe this should be a separate JIRA issue. In do
[
https://issues.apache.org/jira/browse/LUCENE-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625455#action_12625455
]
Ning Li commented on LUCENE-1335:
-
> I don't think so: with autoCommit=true
[
https://issues.apache.org/jira/browse/LUCENE-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625078#action_12625078
]
Ning Li commented on LUCENE-1335:
-
> It's because commit() calls prepareCom
[
https://issues.apache.org/jira/browse/LUCENE-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624998#action_12624998
]
Ning Li commented on LUCENE-1335:
-
I agree that we should not make any API promises a
[
https://issues.apache.org/jira/browse/LUCENE-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624851#action_12624851
]
Ning Li commented on LUCENE-1335:
-
Hi Mike, could you update the patch? I cannot appl
[
https://issues.apache.org/jira/browse/LUCENE-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li resolved LUCENE-1338.
-
Resolution: Invalid
When deprecated constructors are removed in 3.0, autoCommit will always be
false
[
https://issues.apache.org/jira/browse/LUCENE-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614404#action_12614404
]
Ning Li commented on LUCENE-1338:
-
Or is the intention to make autoCommit always f
Java
Issue Type: Bug
Components: Index
Reporter: Ning Li
Priority: Minor
With non-deprecated constructors, IndexWriter's autoCommit is always true.
think there're similar problems with calling optimize() while addIndexes
> is in progress... I think we should disallow that?
Optimize waits for addIndexes to finish? I think it's useful to allow addIndexes
during maybeMerge and optimize, no?
Cheers,
Ning Li
---
Hi,
Should we guard against the case when commit() is called during addIndexes?
Otherwise, errors such as "file does not exist" could occur during commit.
Cheers,
Ning Li
[
https://issues.apache.org/jira/browse/LUCENE-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578518#action_12578518
]
Ning Li commented on LUCENE-1228:
-
Does SegmentInfos really need both "ver
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-1035:
Attachment: LUCENE-1035.contrib.patch
Re-do as a contrib package. Creating BufferPooledDirectory with
[
https://issues.apache.org/jira/browse/LUCENE-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575782#action_12575782
]
Ning Li commented on LUCENE-1204:
-
> I think this is a false alarm.
I just found
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574782#action_12574782
]
Ning Li commented on LUCENE-1035:
-
> It looks like this was never fully done. I wo
[
https://issues.apache.org/jira/browse/LUCENE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572957#action_12572957
]
Ning Li commented on LUCENE-1194:
-
> As of LUCENE-1044, when autoCommit=true, Inde
[
https://issues.apache.org/jira/browse/LUCENE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572576#action_12572576
]
Ning Li commented on LUCENE-1194:
-
Great to see deleteByQuery being added to IndexWr
One main focus is to provide fault-tolerance in this distributed index
system. Correct me if I'm wrong, I think SOLR-303 is focusing on merging
results from multiple shards right now. We'd like to start an open source
project for a fault-tolerant distributed index system (or join if one
already exi
No. I'm curious too. :)
On Feb 6, 2008 11:44 AM, J. Delgado <[EMAIL PROTECTED]> wrote:
> I assume that Google also has distributed index over their
> GFS/MapReduce implementation. Any idea how they achieve this?
>
> J.D.
>
I work for IBM Research. I read the Rackspace article. Rackspace's Mailtrust
has a similar design. Happy to see an existing application on such a system.
Do they plan to open-source it? Is the AOL project an open source project?
On Feb 6, 2008 11:33 AM, Clay Webster <[EMAIL PROTECTED]> wrote:
>
>
HDFS block. This feature may be useful for other HDFS applications (e.g.,
HBase). We would like to collaborate with other people who are interested in
adding this feature to HDFS.
Regards,
Ning Li
> That may be a little too seamless. We want the user to have specific
> control over which fields are efficiently stored separately since they
> will know how that field will be used.
Maybe let users decide field families, like the column families in BigTable?
--
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538638
]
Ning Li commented on LUCENE-1035:
-
> The question is whether such situations are common enough to warrant add
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538129
]
Ning Li commented on LUCENE-1035:
-
> That seems like quite a few docs to retrieve--any particular reason why?
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538112
]
Ning Li commented on LUCENE-1035:
-
> I'll change to "OR" queries and see what happens.
Quer
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537995
]
Ning Li commented on LUCENE-1035:
-
> most lucene usecases store much more than just the document id... that wo
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537978
]
Ning Li commented on LUCENE-1035:
-
> Were the tests run using the same set of queries they were warmed for?
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537972
]
Ning Li commented on LUCENE-1035:
-
> I don't think this is any better than the NIOFileCache directo
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-1035:
Summary: Optional Buffer Pool to Improve Search Performance (was: ptional
Buffer Pool to Improve Search
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-1035:
Lucene Fields: [Patch Available] (was: [New])
> ptional Buffer Pool to Improve Search Performa
[
https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-1035:
Attachment: LUCENE-1035.patch
Coding Changes
--
New classes are localized to the store
: Store
Reporter: Ning Li
An index in RAMDirectory performs better than one in FSDirectory.
But many indexes cannot fit in memory, or applications cannot afford to
spend that much memory on the index. On the other hand, because of locality,
a reasonably sized buffer pool may
lt set is large. But loading it in
> memory when opening index can also be slow if the index is large and updates
> often.
>
> Thanks
>
> -John
>
> On 10/18/07, Ning Li <[EMAIL PROTECTED]> wrote:
> >
> > Make all documents have a term, say "ID:UID",
Make all documents have a term, say "ID:UID", and for each document,
store its UID in the term's payload. You can read off this posting
list to create your array. Will this work for you, John?
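
A minimal sketch of that idea, in plain Java rather than against the Lucene API. The Posting class and buildUidArray method are hypothetical stand-ins for iterating the posting list of the shared term and decoding each payload; the -1 marker for missing documents is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UidArraySketch {
    /** One posting-list entry: a document number plus the UID carried in its payload. */
    static final class Posting {
        final int doc;
        final long uidPayload;

        Posting(int doc, long uidPayload) {
            this.doc = doc;
            this.uidPayload = uidPayload;
        }
    }

    /** Walks the postings of the single shared term and fills uids[doc] for each entry. */
    static long[] buildUidArray(List<Posting> postings, int maxDoc) {
        long[] uids = new long[maxDoc];
        Arrays.fill(uids, -1L); // assumption: -1 marks documents absent from the posting list
        for (Posting p : postings) {
            uids[p.doc] = p.uidPayload;
        }
        return uids;
    }

    public static void main(String[] args) {
        List<Posting> postings = new ArrayList<>();
        postings.add(new Posting(0, 1001L));
        postings.add(new Posting(1, 1002L));
        postings.add(new Posting(3, 1004L)); // doc 2 is absent, e.g. deleted
        System.out.println(Arrays.toString(buildUidArray(postings, 4)));
    }
}
```

The array is built in one sequential pass over a single posting list, which is the point of putting the UID in the payload.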
Cheers,
Ning
On 10/18/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> Forwarding this to java-dev per req
The cause is that in MergeThread.run(), merge in the try block is a
local variable, while merge in the catch block is the class variable.
The merge in the try block could be different from the original merge,
but the catch block always checks the abort flag of the original
merge.
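
The pattern can be reproduced in a few lines. This is only an illustrative sketch: Merge and MergeWorker are hypothetical classes, not the real ConcurrentMergeScheduler code. It shows how a local variable declared inside try goes out of scope, so the same name in catch resolves to the field.

```java
public class MergeShadowSketch {
    /** Hypothetical stand-in for a merge that carries an abort flag. */
    static final class Merge {
        final String name;
        final boolean aborted;

        Merge(String name, boolean aborted) {
            this.name = name;
            this.aborted = aborted;
        }
    }

    static final class MergeWorker {
        private final Merge merge; // field: the merge this worker was started with
        private final Merge next;  // a later merge the worker moves on to

        MergeWorker(Merge original, Merge next) {
            this.merge = original;
            this.next = next;
        }

        /** Buggy: 'merge' in the catch block is the field, not the merge that failed. */
        boolean buggyIgnoresFailure() {
            try {
                Merge merge = next; // local variable, scoped to the try block only
                throw new RuntimeException("failure while running " + merge.name);
            } catch (RuntimeException e) {
                return merge.aborted; // resolves to the field, i.e. the original merge
            }
        }

        /** Fixed: track the running merge in a variable visible to both blocks. */
        boolean fixedIgnoresFailure() {
            Merge running = merge;
            try {
                running = next;
                throw new RuntimeException("failure while running " + running.name);
            } catch (RuntimeException e) {
                return running.aborted; // checks the merge that actually failed
            }
        }
    }

    public static void main(String[] args) {
        MergeWorker w = new MergeWorker(new Merge("original", true), new Merge("next", false));
        System.out.println("buggy ignores failure: " + w.buggyIgnoresFailure());
        System.out.println("fixed ignores failure: " + w.fixedIgnoresFailure());
    }
}
```

With an aborted original merge and a live follow-up merge, the buggy version would swallow a genuine failure of the follow-up merge.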
-
[
https://issues.apache.org/jira/browse/LUCENE-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12531513
]
Ning Li commented on LUCENE-1007:
-
One more thing about the approximation of actual bytes used for buffered delete
[
https://issues.apache.org/jira/browse/LUCENE-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-1007:
Attachment: LUCENE-1007.take2.patch
Take2 counts buffered delete terms towards ram buffer used. A test
[
https://issues.apache.org/jira/browse/LUCENE-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-1007:
Attachment: LUCENE-1007.patch
Just got around to doing the patch:
- The patch includes changes to
Reporter: Ning Li
Priority: Minor
See discussion at http://www.gossamer-threads.com/lists/lucene/java-dev/53186
Provide the flexibility to turn on/off any flush triggers - ramBufferSize,
maxBufferedDocs and maxBufferedDeleteTerms. One of ramBufferSize and
maxBufferedDocs
On 9/24/07, Michael McCandless <[EMAIL PROTECTED]> wrote:
> On flushing pending deletes by RAM usage: should we just bundle this
> up under "flush by RAM usage"? Ie "when total RAM usage, either from
> buffered deletes, buffered docs, anything else, exceeds X then it's
> time to flush"? (Instead
to max int
MB.
Ning
On 9/24/07, Michael McCandless <[EMAIL PROTECTED]> wrote:
>
> "Doron Cohen" <[EMAIL PROTECTED]> wrote:
> > Hi Ning,
> >
> > "Ning Li" <[EMAIL PROTECTED]> wrote on 24/09/2007 00:26:36:
> >
> > > Do y
Hi Doron,
> On the other, the logic of "use memory-limit unless added-docs-limit was
> specified" seems somewhat confusing
The design intention is to use either
maxBufferedDocs/maxBufferedDeleteTerms or ramBufferSize, but not both
at the same time.
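
To make the intention concrete, here is a rough sketch of mutually exclusive triggers. It is not IndexWriter's code: the class, the DISABLED marker and the 16 MB default are assumptions for illustration, and maxBufferedDeleteTerms is left out for brevity.

```java
public class FlushTriggerSketch {
    static final int DISABLED = -1; // hypothetical marker for a trigger that is switched off

    private int maxBufferedDocs = DISABLED;
    private double ramBufferSizeMB = 16.0; // assumed default, chosen only for this sketch

    /** Selecting the doc-count trigger switches the RAM trigger off. */
    void setMaxBufferedDocs(int docs) {
        maxBufferedDocs = docs;
        ramBufferSizeMB = DISABLED;
    }

    /** Selecting the RAM trigger switches the doc-count trigger off. */
    void setRamBufferSizeMB(double mb) {
        ramBufferSizeMB = mb;
        maxBufferedDocs = DISABLED;
    }

    /** Only the currently active trigger is consulted. */
    boolean shouldFlush(int bufferedDocs, double ramUsedMB) {
        if (maxBufferedDocs != DISABLED) {
            return bufferedDocs >= maxBufferedDocs;
        }
        return ramUsedMB >= ramBufferSizeMB;
    }

    public static void main(String[] args) {
        FlushTriggerSketch config = new FlushTriggerSketch();
        System.out.println(config.shouldFlush(1000, 1.0)); // RAM trigger active by default
        config.setMaxBufferedDocs(10);
        System.out.println(config.shouldFlush(5, 100.0));  // RAM usage no longer consulted
    }
}
```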
> (why only by pending adds, why not also by pe
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527286
]
Ning Li commented on LUCENE-847:
> This was actually intentional: I thought it fine if the application is
> s
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527239
]
Ning Li commented on LUCENE-847:
Hmm, it's actually possible to have concurrent merges with SerialMergeSche
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527224
]
Ning Li commented on LUCENE-847:
Access of mergeThreads in ConcurrentMergeScheduler.merge() should be
synchronized
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12526628
]
Ning Li commented on LUCENE-847:
> OK, another rev of the patch (take6). I think it's close!
Yes, it
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12526029
]
Ning Li commented on LUCENE-847:
Comments on optimize():
- In the while loop of optimize
[
https://issues.apache.org/jira/browse/LUCENE-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525271
]
Ning Li commented on LUCENE-992:
The patch looks good! A few comments and/or observations:
- addDocument(Document
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524084
]
Ning Li commented on LUCENE-847:
> Not quite following you here... not being eligible because the merge
>
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523957
]
Ning Li commented on LUCENE-847:
> True, but I was thinking CMPW could be an exception to this rule. I
> g
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523621
]
Ning Li commented on LUCENE-847:
I include comments for both LUCENE-847 and LUCENE-870 here since they are
closely
Hi Mike,
I cannot apply the patch cleanly. MergePolicy.java, e.g., seems to be
missing from the patch.
On 8/24/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote:
>
> [
> https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
[
https://issues.apache.org/jira/browse/LUCENE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-987:
---
Attachment: deprecateIndexModifier.patch
> Deprecate IndexModif
Deprecate IndexModifier
---
Key: LUCENE-987
URL: https://issues.apache.org/jira/browse/LUCENE-987
Project: Lucene - Java
Issue Type: Test
Components: Index
Reporter: Ning Li
Priority
[
https://issues.apache.org/jira/browse/LUCENE-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-978:
---
Attachment: Readers.patch
Similar fixes are added for FieldsReader and TermVectorsReader as well.
>
[
https://issues.apache.org/jira/browse/LUCENE-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520286
]
Ning Li commented on LUCENE-978:
> Agreed. Actually, it also looks like we need to do something similar
[
https://issues.apache.org/jira/browse/LUCENE-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-978:
---
Lucene Fields: [Patch Available] (was: [New])
> GC resources in TermInfosReader when exception occurs
[
https://issues.apache.org/jira/browse/LUCENE-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-978:
---
Attachment: TermInfosReader.patch
> GC resources in TermInfosReader when exception occurs in its construc
Issue Type: Bug
Components: Index
Reporter: Ning Li
Priority: Minor
Attachments: TermInfosReader.patch
I replaced IndexModifier with IndexWriter in test case TestStressIndexing and
noticed the test failed from time to time because some .tis file is still
IndexWriter does everything IndexModifier does and more, except
"deleteDocument(int doc)". Can we reach consensus on: 1 Should we
deprecate IndexModifier before 3.0 and remove it in 3.0? 2 If so, do
we have to add "deleteDocument(int doc)" to IndexWriter?
We know how to support "deleteDocument(int
On 8/8/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 8/8/07, Ning Li <[EMAIL PROTECTED]> wrote:
> > This reminds me: It'd be nice if we could support delete-by-query someday.
> > :)
> >
> > I was thinking people use deleteDocument(int docid) whe
On 8/8/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> Let's take a simple case of deleting documents in a range, like
> date:[2006 TO 2008]
> One would currently need to close the writer and open a new reader to
> ensure that they can "see" all the documents. Then execute a
> RangeQuery, collect th
On 8/8/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 8/8/07, Ning Li <[EMAIL PROTECTED]> wrote:
> > But you still think it's worth to be included in IndexWriter, right?
>
> I'm not sure... (unless I'm missing some obvious use-cases).
> If one could g
On 8/8/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> To make delete by docid useful, one needs a way to *get* those docids.
> A callback after flush that provided a current list of readers for the
> segments would serve.
Interesting. That makes sense.
> I think IndexWriter.deleteDocument(int doc)
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518520
]
Ning Li commented on LUCENE-847:
> Furthermore, I think this is all contained within IndexWriter, right?
> I
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518486
]
Ning Li commented on LUCENE-847:
The following comments are about the impact on merge if we add
"deleteDocumen
ffered delete doc ids. I
don't think it should be the reason not to support "deleteDocument(int
doc)" in IndexWriter. But its impact on concurrent merge is a concern.
Ning
On 8/7/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> +1
>
>
> On Aug 7, 2007, at 3:37 PM, Nin
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518453
]
Ning Li commented on LUCENE-847:
On 8/8/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote:
> Actua
On 8/7/07, Steven Parkes (JIRA) <[EMAIL PROTECTED]> wrote:
>
> [
> https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518210
> ]
>
> Steven Parkes commented on LUCENE-847:
> --
>
>
With the plan towards 3.0 release laid out, I think it's a good time
to deprecate IndexModifier and eventually remove IndexModifier.
The only method in IndexModifier which is not implemented in
IndexWriter is "deleteDocument(int doc)". This is because of the
concern that document ids are changing
[
https://issues.apache.org/jira/browse/LUCENE-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512271
]
Ning Li commented on LUCENE-938:
I didn't make myself clear. Let me try again. The patch includes two par
[
https://issues.apache.org/jira/browse/LUCENE-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510422
]
Ning Li commented on LUCENE-938:
Good catch, Steven!
One thing though: I thought we had assumed that there wouldn
Steve, Mike,
Thanks for the explanation! I meant cascading but wrote optimizing. So
it still cascades merges.
It would merge based on size (not # docs), would be free to merge
adjacent segments (not just rightmost segments), and would merge N
(configurable) at a time. The part that's still unc
Having the merge policy own segmentInfos sounds kind of hard to me.
Among other things, there's a lot of code in IndexWriter for managing
segmentInfos with regards to transactions. I'm pretty wary of touching
that code. Is there a way around that?
But conceptually, do you agree it's a good idea
On 3/23/07, Steven Parkes (JIRA) <[EMAIL PROTECTED]> wrote:
In fact, there are a few things here that are fairly subtle/important. The relationship/protocol
between the writer and policy is pretty strong. This can be seen in the strawman concurrent
merge code where the merge policy holds state and
On 3/31/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote:
Create merge policy that doesn't periodically inadvertently optimize
So we could make a small change to the policy by only merging the
first mergeFactor segments o
On 4/4/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote:
Note that for "autoCommit=false", this optimization is somewhat less
important, depending on how often you actually close/open a new
IndexWriter. In the extreme case, if you open a writer, add 100 MM
docs, close the writer, then no
On 4/3/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote:
* With term vectors and/or stored fields, the new patch has
substantially better RAM efficiency.
Impressive numbers! The new patch improves RAM efficiency quite a bit
even without term vectors or stored fields, because of the
FYI: Patch submitted in http://issues.apache.org/jira/browse/LUCENE-847.
Cheers,
Ning
"Here is a patch for concurrent merge as discussed in:
http://www.gossamer-threads.com/lists/lucene/java-dev/45651?search_string=concurrent%20merge;#45651
"I put it under this issue because it helps design and
It would be great to support early termination for top-K queries within
the DAAT query processing model in Lucene. There is a fair amount of
published work in related areas.
http://portal.acm.org/citation.cfm?id=956944 is one of them.
Am I getting it right? If a query requires top-K results, isn't it
su
[
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Li updated LUCENE-847:
---
Attachment: concurrentMerge.patch
Here is a patch for concurrent merge as discussed in:
http://www.gossamer
On 3/26/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote:
Ahhh, this is a very good point. OK I won't deprecate "flushing by
doc count" and instead will allow either "flush by RAM usage" (default
to this?) or "flush by doc count".
Just want to clarify: It's either "flush and merge by by
Hi Steven,
I haven't read the details, but should maxBufferedDocs be exposed in
some subinterfaces instead of the MergePolicy interface? Since some
policies may use it and others may use byte size or something else.
It's great that you've started on concurrent merge as well! I haven't
got a chan
On 3/22/07, Michael McCandless <[EMAIL PROTECTED]> wrote:
Yes the code re-computes the level of a given segment from the current
values of maxBufferedDocs & mergeFactor. But when these values have
changed (or, segments were flushed by RAM not by maxBufferedDocs) then
the way it computes level no
On 3/22/07, Michael McCandless <[EMAIL PROTECTED]> wrote:
Right I'm calling a newly created segment (ie flushed from RAM) level
0 and then a level 1 segment is created when you merge 10 level 0
segments, level 2 is created when merge 10 level 1 segments, etc.
That is not how the current merge p
Many good points! Thanks, guys!
When background merge is employed, document additions can
out-pace merging, no matter how many background merge threads
are used. Blocking has to happen at some point.
So, if we do anything, we make it simple. I agree with what
Robert and Yonik have proposed: docu
On 2/21/07, Doron Cohen (JIRA) <[EMAIL PROTECTED]> wrote:
Imagine the application and Lucene could talk, with the current
implementation we could hear something like this: ...
However, there could be multiple threads updating the same index. For
example, thread 1 deletes the term "id:5" twice,