[
https://issues.apache.org/jira/browse/LUCENE-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated LUCENE-857:
Attachment: LUCENE-857.refactoring-approach.diff
An example of what I'm thinking would make sense from a ba
[
https://issues.apache.org/jira/browse/LUCENE-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487679
]
Hoss Man commented on LUCENE-857:
-
I don't think it's a question of being careless about reading the Changelog --
I
[
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487675
]
Marvin Humphrey commented on LUCENE-584:
DisjunctionSumScorer (the ORScorer) actually calls Scorer.score() on
[
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487674
]
Otis Gospodnetic commented on LUCENE-584:
-
A. I'll look at the patch again tomorrow and follow what you
[
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487667
]
Doron Cohen commented on LUCENE-584:
> No Scorer, no BooleanScorer(2), no ConjunctionScorer...
Thanks, I was re
[
https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Miller updated LUCENE-794:
---
Attachment: spanhighlighter5.patch
Apologize for the delay on this -- I was pulled into a busy produc
A memory saving optimization would be to not load the corresponding
String[] in the string index (as discussed previously), but there is
currently no way to tell the FieldCache that the strings are unneeded.
The String values are only needed for merging results in a
MultiSearcher.
Yep, which hap
On 4/9/07, jian chen <[EMAIL PROTECTED]> wrote:
But, on a higher level, my idea is really just to create an array of
integers for each sort field. The array length is NumOfDocs in the index.
Each integer corresponds to a displayable string value. For example, if you
have a field of different colo
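The integer-per-document idea above can be sketched as follows. This is only an illustration under stated assumptions, not code from the thread or from Lucene: the class and method names (SortOrdinals, buildLookup, buildOrdinals, sortDocs) are hypothetical, every document is assumed to have exactly one non-null value for the sort field, and everything is held in plain in-memory arrays.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical sketch: one int per document, pointing into a sorted table of distinct values.
final class SortOrdinals {
    /** Distinct field values in sorted order; a value's position is its ordinal. */
    static String[] buildLookup(String[] valuePerDoc) {
        return new TreeSet<>(Arrays.asList(valuePerDoc)).toArray(new String[0]);
    }

    /** One int per document (array length == number of docs), indexing into the lookup table. */
    static int[] buildOrdinals(String[] valuePerDoc, String[] lookup) {
        int[] ords = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            ords[doc] = Arrays.binarySearch(lookup, valuePerDoc[doc]);
        }
        return ords;
    }

    /** Sorts matching doc ids by ordinal only; ties are broken by doc id. No strings are compared. */
    static int[] sortDocs(int[] matchingDocs, int[] ords) {
        Integer[] boxed = new Integer[matchingDocs.length];
        for (int i = 0; i < boxed.length; i++) {
            boxed[i] = matchingDocs[i];
        }
        Arrays.sort(boxed, (a, b) -> {
            int c = Integer.compare(ords[a], ords[b]);
            return c != 0 ? c : Integer.compare(a, b);
        });
        int[] out = new int[boxed.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = boxed[i];
        }
        return out;
    }
}
```

With this layout the lookup table is only needed when a value has to be displayed; sorting within a single index touches the int array alone.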
Hi, Paul,
I think whether to warm up or not needs some benchmarking for the specific
application.
For the implementation of the sort fields, when I talk about norms in
Lucene, I am thinking we could borrow the same implementation as the norms to
do it.
But, on a higher level, my idea is really just to c
In our application, we have to sync up the index pretty frequently, and the
warm-up of the index is killing it.
Yep, it speeds up the first sort, but at the cost of making all the
others slower (maybe significantly so). That's obviously not ideal
but could make use of sorts in larger index
Hi, Paul,
Thanks for your reply. Regarding your previous email about the need for a
disk-based sorting solution, I largely agree with your points. One incentive
for your approach is that we no longer need to warm up the index when the
index is huge.
In our application, we have to sync up th
[
https://issues.apache.org/jira/browse/LUCENE-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487644
]
Yonik Seeley commented on LUCENE-859:
-
> Though it might still be handy to have something with main() that spits
Paul Smith wrote:
I don't disagree with the premise that it involves substantial I/O and
would increase the time taken to sort, and why this approach shouldn't
be the default mechanism, but it's not too difficult to build a disk I/O
subsystem that can allocate many spindles to service this and
Now, if we could use integers to represent the sort field values, which is
typically the case for most applications, maybe we can afford to have the
sort field values stored in the disk and do disk lookup for each document
matched? The look up of the sort field value will be as simple as
On 10/04/2007, at 4:18 AM, Doug Cutting wrote:
Paul Smith wrote:
Disadvantages to this approach:
* It's a lot more I/O intensive
I think this would be prohibitive. Queries matching more than a
few hundred documents will take several seconds to sort, since
random disk accesses are requir
[
https://issues.apache.org/jira/browse/LUCENE-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487640
]
Otis Gospodnetic commented on LUCENE-859:
-
Though it might still be handy to have something with main() that
[
https://issues.apache.org/jira/browse/LUCENE-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic closed LUCENE-859.
---
Resolution: Won't Fix
Lucene Fields: [New, Patch Available] (was: [Patch Available, Ne
[
https://issues.apache.org/jira/browse/LUCENE-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487635
]
Yonik Seeley commented on LUCENE-859:
-
Isn't this redundant with existing IndexReader methods?
deletedDocs() ==
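For context, a deleted-document count can usually be derived from two counters a reader already keeps. The sketch below assumes the comment has maxDoc() and numDocs() in mind (the snippet above is cut off, so that is a guess); the class DeletedDocCount and its use of java.util.BitSet are hypothetical stand-ins, not Lucene code.

```java
import java.util.BitSet;

// Hypothetical stand-in for a reader over one segment; not the Lucene IndexReader API.
final class DeletedDocCount {
    /** Live documents: every slot up to maxDoc that is not flagged as deleted. */
    static int numDocs(int maxDoc, BitSet deleted) {
        return maxDoc - deleted.cardinality();
    }

    /** Deleted documents derived purely from the two existing counters. */
    static int deletedDocs(int maxDoc, int numDocs) {
        return maxDoc - numDocs;
    }
}
```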
Hi, Doug,
I have been thinking about this as well lately and have some thoughts
similar to Paul's approach.
Lucene has the norm data for each document field. Conceptually it is a byte
array with one byte for each document field. At query time, I think the norm
array is loaded into memory the fir
[
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487631
]
Otis Gospodnetic commented on LUCENE-584:
-
Doron: just to address your question from Apr/7 - I expect/hope to
[
https://issues.apache.org/jira/browse/LUCENE-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated LUCENE-859:
Attachment: LUCENE-859
El patcho.
> Expose the number of deleted docs in index/segment
>
Expose the number of deleted docs in index/segment
--
Key: LUCENE-859
URL: https://issues.apache.org/jira/browse/LUCENE-859
Project: Lucene - Java
Issue Type: New Feature
Components:
[
https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487617
]
Doron Cohen commented on LUCENE-848:
Seems okay to me (since it's all in the benchmark).
> Add supported for Wik
[
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487616
]
Doron Cohen commented on LUCENE-584:
> > When you rerun, you may want to use my alg - to compare the two approach
[
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487613
]
Mike Klaas commented on LUCENE-584:
---
Instead of discarding the first run, the approach I usually take is to run 3-4
[
https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487609
]
Steven Parkes commented on LUCENE-848:
--
That's what I meant (and did).
If it's okay, I'll bundle it into 848.
[
https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487608
]
Doron Cohen commented on LUCENE-848:
> Also, I was going to add support to the algorithm format for setting max
[
https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487600
]
Steven Parkes commented on LUCENE-848:
--
By the way, that's a rough patch. I'm cleaning it up as I use it to test
[
https://issues.apache.org/jira/browse/LUCENE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andy Liu updated LUCENE-855:
Attachment: TestRangeFilterPerformanceComparison.java
Here's my new benchmark.
> MemoryCachedRangeFilter t
[
https://issues.apache.org/jira/browse/LUCENE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487595
]
Andy Liu commented on LUCENE-855:
-
In your updated benchmark, you're combining the range filter with a term query
th
[
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487594
]
Yonik Seeley commented on LUCENE-584:
-
> When you rerun, you may want to use my alg - to compare the two approach
[
https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Parkes updated LUCENE-848:
-
Attachment: LUCENE-848.txt
This patch is a first cut at Wikipedia benchmark support. It downloads
[
https://issues.apache.org/jira/browse/LUCENE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic reassigned LUCENE-855:
---
Assignee: Otis Gospodnetic
> MemoryCachedRangeFilter to boost performance of Range qu
[
https://issues.apache.org/jira/browse/LUCENE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487590
]
Otis Gospodnetic commented on LUCENE-855:
-
Comments about the patch so far:
Cosmetics:
- You don't want to re
Otis Gospodnetic wrote:
I'd advise against calling optimize() at all in an environment whose indices
are constantly updated.
+1
Doug
Paul Smith wrote:
Disadvantages to this approach:
* It's a lot more I/O intensive
I think this would be prohibitive. Queries matching more than a few
hundred documents will take several seconds to sort, since random disk
accesses are required per matching document. Such an approach is only
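A back-of-the-envelope version of that estimate, as a sketch: the 10 ms per random access used in the usage note is an assumed figure for a single spinning disk, not a number from the thread, and the class and method names are hypothetical.

```java
// Hypothetical estimate: one random disk access per matching document.
final class SeekCostEstimate {
    /** Estimated sort time in milliseconds when each matching document costs one seek. */
    static long estimateSortMillis(int matchingDocs, double millisPerSeek) {
        return Math.round(matchingDocs * millisPerSeek);
    }
}
```

Under the assumed 10 ms per access, estimateSortMillis(300, 10.0) gives 3000 ms, which is consistent with "several seconds" for a few hundred matches.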
The idea is to efficiently get the desired result set (top N) at once
without having to re-run different queries inside the application
logic. Query relaxation avoids having several round trips and possibly
could be offered with and without deduplication. Maybe this is a
feature required for Solr
[
https://issues.apache.org/jira/browse/LUCENE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487587
]
Otis Gospodnetic commented on LUCENE-855:
-
OK. I'll wait for the new performance numbers before committing.
[
https://issues.apache.org/jira/browse/LUCENE-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic resolved LUCENE-853.
-
Resolution: Fixed
Lucene Fields: [New, Patch Available] (was: [Patch Available, Ne
[
https://issues.apache.org/jira/browse/LUCENE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Ericson updated LUCENE-855:
Attachment: FieldCacheRangeFilter.patch
This version will create a real BitSet() when cloned and wi
Not that I know of. One typically puts that in application logic and re-runs
or offers to run alternative queries. No de-duping there, unless you do it in
your app. I think one problem with the described approach and Lucene would be
that Lucene's scores are not "absolute".
Otis
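The application-level approach described here could look roughly like the sketch below. It is an assumption-laden illustration, not a Lucene or Solr feature: QueryRelaxation and topN are hypothetical names, the search function is a placeholder for whatever runs a query and returns document ids in rank order, and the queries are assumed to be supplied from strictest to loosest.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of progressive query relaxation done in application logic.
final class QueryRelaxation {
    /**
     * Runs queries from strictest to loosest until n distinct documents are collected.
     * Results from a stricter query always rank ahead of results from a looser one.
     */
    static List<String> topN(List<String> queriesStrictToLoose,
                             Function<String, List<String>> search,
                             int n) {
        Set<String> seen = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (String query : queriesStrictToLoose) {
            if (seen.size() >= n) {
                break; // enough hits, no need to relax further
            }
            for (String docId : search.apply(query)) {
                if (seen.size() >= n) {
                    break;
                }
                seen.add(docId);
            }
        }
        return new ArrayList<>(seen);
    }
}
```

Because scores from different queries are not comparable, the sketch never merges by score; it keeps each tier's results in the order the tier returned them and appends looser tiers after stricter ones.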
link from Lucene web page to API docs
-
Key: LUCENE-858
URL: https://issues.apache.org/jira/browse/LUCENE-858
Project: Lucene - Java
Issue Type: Improvement
Reporter: Daniel Naber
Assi
Hi Daniel,
Can you file this as an issue and assign it to me? Nigel and I are
working through a few things w/ Hudson and the docs, still. The gist
of it is that the API and website will be put back on people.a.o.
This will mean that a relative link like api/overview-
summary.html#overvi
Has anyone within the Lucene or Solr community attempted to code a
progressive query relaxation technique similar to the one described
here for Oracle Text?
http://www.oracle.com/technology/products/text/htdocs/prog_relax.html
Thanks,
-- J.D.