Chris Hostetter wrote:
> : > If a given Tokenizer does not need to do any character normalization (I
> : would think most wouldn't) is there any added cost during tokenization with
> : this change?
> :
> : Thank you for your reply, Mike!
> : There is no added cost if Tokenizer doesn't need to c
[
https://issues.apache.org/jira/browse/LUCENE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648869#action_12648869
]
Tim Sturge commented on LUCENE-1461:
Here's some benchmark data to demonstrate the uti
[
https://issues.apache.org/jira/browse/LUCENE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Sturge updated LUCENE-1461:
---
Attachment: RangeMultiFilter.java
Constructs a virtual RangeFilter on top of an already existing
Di
[
https://issues.apache.org/jira/browse/LUCENE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Sturge updated LUCENE-1461:
---
Attachment: DisjointMultiFilter.java
Base code which builds the integer array.
> Cached filter for
Cached filter for a single term field
-
Key: LUCENE-1461
URL: https://issues.apache.org/jira/browse/LUCENE-1461
Project: Lucene - Java
Issue Type: New Feature
Reporter: Tim Sturge
These class
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648839#action_12648839
]
Michael Busch commented on LUCENE-1458:
---
{quote}
We could also explore something in-
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648835#action_12648835
]
Marvin Humphrey commented on LUCENE-1458:
-
> I'm not sure I'd trust the OS's IO ca
Change all contrib TokenStreams/Filters to use the new TokenStream API
--
Key: LUCENE-1460
URL: https://issues.apache.org/jira/browse/LUCENE-1460
Project: Lucene - Java
Issu
[
https://issues.apache.org/jira/browse/LUCENE-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Busch resolved LUCENE-1422.
---
Resolution: Fixed
Lucene Fields: [New, Patch Available] (was: [Patch Available, New])
> it'd be nice to genericize MultiLevelSkipListWriter so that it could index
arbitrary files
+1 on this idea. Using skip lists for the term index would be an
improvement.
On Tue, Nov 18, 2008 at 12:27 PM, Michael McCandless (JIRA) <[EMAIL PROTECTED]
> wrote:
>
>[
> https://issues.apache.org
On a side note, and I have not looked at the flexible indexing API enough to
know if there is some equivalent but are we moving to something like MG4J's
MutableString
http://mg4j.dsi.unimi.it/docs/it/unimi/dsi/mg4j/util/MutableString.html instead
of java.lang.String objects?
On Tue, Nov 18, 2008 at
Nice! I'm looking at using PForDelta in creating the tag index type of
system. Do you think there is an elegant way to add realtime updates to
individual fields using the current (or future) flexible indexing API?
On Tue, Nov 18, 2008 at 2:11 PM, Michael McCandless (JIRA)
<[EMAIL PROTECTED]> wrot
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648781#action_12648781
]
Michael Busch commented on LUCENE-1458:
---
I'll look into this patch soon.
Just wante
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1458:
---
Attachment: LUCENE-1458.patch
[Attached patch]
To test whether the new pluggable c
[
https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648768#action_12648768
]
Paul Smith commented on LUCENE-1342:
java version "1.6.0_10"
Java(TM) SE Runtime Envir
[
https://issues.apache.org/jira/browse/LUCENE-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Jones updated LUCENE-1459:
---
Attachment: caching-wrapper-filter.diff
Patch against 2.4.0 to be more careful about returning from
CachingWrapperFilter crashes if you call both bits() and getDocIdSet()
--
Key: LUCENE-1459
URL: https://issues.apache.org/jira/browse/LUCENE-1459
Project: Lucene - Java
Issu
[
https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648748#action_12648748
]
Michael McCandless commented on LUCENE-1342:
Just to confirm, it was at least
Why not create new lightweight references to the directory, using
WeakReferences and ReferenceQueues, and avoid the need to
manually use incRef and decRef?
Tracking state like this almost always leads to problems - this is
why Java has GC in the first place - because it is very diff
I think this makes sense.
But: I think we'd need to add incRef/decRef to Directory? And fix the
newly added logic in DirectoryIndexReader that now clones the dir
during reopen (because it's hardwired to only work with FSDir).
Mike
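
For readers unfamiliar with the incRef/decRef pattern mentioned above, here is a minimal sketch of manual reference counting. The class RefCountedResource and its methods are hypothetical stand-ins for illustration; this is not Lucene's Directory and does not reflect how the committed code works.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a directory-like resource; not Lucene's Directory. */
class RefCountedResource {
    // The creator holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    /** Acquire one more reference; fails if the resource was already released. */
    void incRef() {
        int current;
        do {
            current = refCount.get();
            if (current <= 0) {
                throw new IllegalStateException("resource already closed");
            }
        } while (!refCount.compareAndSet(current, current + 1));
    }

    /** Release one reference; the last release closes the resource. */
    void decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining == 0) {
            closed = true; // a real resource would free file handles here
        } else if (remaining < 0) {
            throw new IllegalStateException("decRef called more times than incRef");
        }
    }

    boolean isClosed() { return closed; }

    int getRefCount() { return refCount.get(); }
}
```

Every caller that shares the resource pairs one incRef with one decRef. The WeakReference alternative proposed in the quoted message would instead let the garbage collector decide when the resource is no longer reachable.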
Mark Miller wrote:
Does anyone object to making IndexRe
: > If a given Tokenizer does not need to do any character normalization (I
: would think most wouldn't) is there any added cost during tokenization with
: this change?
:
: Thank you for your reply, Mike!
: There is no added cost if Tokenizer doesn't need to call correctOffset().
But every token
OK will do :) Nice that you're paying attention!
Mike
Andrzej Bialecki wrote:
Michael McCandless (JIRA) wrote:
[ https://issues.apache.org/jira/browse/LUCENE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647960#action_12647960 ] Michael Mc
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648739#action_12648739
]
Michael McCandless commented on LUCENE-1458:
bq. Can we design a format that a
It's not that it isn't required -- it's just that it stores less info
than before.
I changed the _X.tis format such that at each seekable point (every 128
terms by default), everything is written as absolutes (term text, freq
& prox offset). This means the _X.tii file only has to store the
index
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648727#action_12648727
]
Marvin Humphrey commented on LUCENE-1458:
-
The work on streamlining the term dicti
Michael,
Can you describe a bit more about why the term dictionary index is no longer
required?
Jason
On Tue, Nov 18, 2008 at 7:41 AM, Michael McCandless (JIRA)
<[EMAIL PROTECTED]> wrote:
>
> [
> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetab
: Sorry that the build was broken for three days. I don't have a hudson account.
absolutely, 100%, not your fault ... i wasn't giving you a hard time at
all, i was just trying to goad the other PMC members to take a more active
role in maintaining our hudson setup.
: So I can't get one unless
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1458:
---
Attachment: LUCENE-1458.patch
Woops, sorry... I was missing a bunch of files. Try t
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648613#action_12648613
]
Mark Miller commented on LUCENE-1458:
-
Hmmm...I think something is missing - FormatPo
Further steps towards flexible indexing
---
Key: LUCENE-1458
URL: https://issues.apache.org/jira/browse/LUCENE-1458
Project: Lucene - Java
Issue Type: New Feature
Components: Index
Affects Ve
[
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1458:
---
Attachment: LUCENE-1458.patch
> Further steps towards flexible indexing
> --
[
https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved LUCENE-1453.
Resolution: Fixed
Fix Version/s: 2.9
Committed revision 718540. Thanks Mar
[
https://issues.apache.org/jira/browse/LUCENE-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved LUCENE-1456.
Resolution: Fixed
Fix Version/s: 2.9
Committed revision 718537.
Thanks Mar
[
https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648552#action_12648552
]
Michael McCandless commented on LUCENE-1453:
OK thanks for the review guys!
It is a bug, but it's in dead code that's never called. I'll remove
the code.
Mike
Michael Busch wrote:
see http://issues.apache.org/jira/browse/LUCENE-1456
Shai Erera wrote:
Hi
I looked at FieldInfo and found this line (95):
if (this.omitTf != omitTf) {
this.omitTf = true;
[
https://issues.apache.org/jira/browse/LUCENE-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned LUCENE-1456:
--
Assignee: Michael McCandless
> FieldInfo omitTerms bug
> -
see http://issues.apache.org/jira/browse/LUCENE-1456
Shai Erera wrote:
Hi
I looked at FieldInfo and found this line (95):
if (this.omitTf != omitTf) {
this.omitTf = true; // if one require omitTf at least once, it remains off for life
}
Shouldn't it be:
if (
I'm almost sure this was not the expected logic.
Otherwise the "this.omitTf = true" statement will never be executed.
Based on code logic, it should probably be what you are saying: "this.omitTf
!= other.omitTf" instead of "this.omitTf != omitTf" : )
Regards,
Adriano Crestani Campos
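
The self-comparison described in this message can be reproduced in a small sketch. FieldInfoSketch and both update methods are hypothetical simplifications written for illustration, assuming a method that takes another instance as its argument; this is not the Lucene FieldInfo source.

```java
/** Hypothetical, simplified model of the flag merge discussed above; not the Lucene source. */
class FieldInfoSketch {
    boolean omitTf;

    FieldInfoSketch(boolean omitTf) {
        this.omitTf = omitTf;
    }

    /**
     * Buggy variant: with no local variable or parameter named omitTf,
     * the bare name resolves to this.omitTf, so the field is compared
     * with itself and the condition is always false.
     */
    void updateBuggy(FieldInfoSketch other) {
        if (this.omitTf != omitTf) {
            this.omitTf = true; // never reached
        }
    }

    /** Proposed variant: compare against the other instance's flag. */
    void updateFixed(FieldInfoSketch other) {
        if (this.omitTf != other.omitTf) {
            this.omitTf = true;
        }
    }
}
```

With the buggy variant the flag never changes, whatever the other instance holds. With the proposed variant the flag becomes true as soon as the two instances disagree.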
On Tue, Nov 18
Hi
I looked at FieldInfo and found this line (95):
if (this.omitTf != omitTf) {
this.omitTf = true; // if one require omitTf at least once, it remains off for life
}
Shouldn't it be:
if (this.omitTf != other.omitTf) {
this.omitTf = true;// i