Duh, I didn't even notice we also had an existing ArrayUtilTest --
yes, please post a patch!
Mike
On Thu, Jan 21, 2010 at 8:59 AM, Uwe Schindler wrote:
> Somehow we have now both ArrayUtilTest and TestArrayUtil? I think they should
> be merged and TestArrayUtil as name preferred (as all other t
On Thu, Jan 28, 2010 at 5:20 AM, Uwe Schindler wrote:
> Can we fix NIOFSIndexInput to simply reopen the channel when the exception
> occurs?
The problem is that the file may have been deleted in the meantime.
This is quite a nasty behavior of FileChannel.
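To make the FileChannel behavior concrete, here is a minimal sketch (not Lucene code; the class and method names are made up for illustration, and it uses the newer java.nio.file helpers just for brevity). It relies on the documented contract that a blocking read on a thread whose interrupt status is set closes the channel and throws ClosedByInterruptException:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptClosesChannel {

    /** Returns true if a read on an interrupted thread left the channel closed. */
    public static boolean channelClosedByInterrupt() throws IOException {
        Path tmp = Files.createTempFile("interrupt-demo", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ)) {
                // Set the interrupt status before the (positional) read.
                Thread.currentThread().interrupt();
                try {
                    channel.read(ByteBuffer.allocate(4), 0);
                    return false; // not expected: the read should have been aborted
                } catch (ClosedByInterruptException e) {
                    // The channel is now closed for every thread sharing it.
                    return !channel.isOpen();
                } finally {
                    Thread.interrupted(); // clear the flag so later code is unaffected
                }
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("closed by interrupt: " + channelClosedByInterrupt());
    }
}
```

Once the channel is closed like this, any reopen has to go back to the file name, which is exactly where the "file may have been deleted" problem comes in.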
Mike
---
On Thu, Jan 28, 2010 at 5:59 AM, Uwe Schindler wrote:
> But if you keep the underlying RandomAccessFile open?
We could do that but... won't this consume 2 file descriptors?
Mike
-
To unsubscribe, e-mail: java-dev-unsubscr...@lu
>>
>> Possibly we could wait until Simon provides a testcase that fails.
>>
>> -
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>>
>> > -Original Message-
On Thu, Jan 28, 2010 at 6:38 AM, Uwe Schindler wrote:
> So I checked the code of NIOFSIndexInput, my last comment was not really
> correct:
> NIOFSIndexInput extends SimpleFSIndexInput and that opens the RAF. In the
> ctor RAF.getChannel() is called. The RAF stays open until the file is closed
. Or, 3) don't use NIOFSDir!
Mike
On Thu, Jan 28, 2010 at 7:29 AM, Simon Willnauer
wrote:
> On Thu, Jan 28, 2010 at 12:43 PM, Michael McCandless
> wrote:
>> On Thu, Jan 28, 2010 at 6:38 AM, Uwe Schindler wrote:
>>
>>> So I checked the code of NIOFSIndex
nd people could then report Lucene
> 3.X has slowed...
>
> On Thu, Jan 28, 2010 at 5:24 AM, Michael McCandless
> wrote:
>> Bummer.
>>
>> So the only viable workarounds are 1) don't use Thread.interrupt (nor,
>> things like Future.cancel, which in turn us
+1 to release. Thank you for volunteering :) We've got a number of
good bug fixes pending...
But: I think we should simply name it 3.0.1? If we skip 3.0.1 I think
it will cause confusion? We can state in the CHANGES that 2.9.2 has
same bug fixes as 3.0.1 and vice versa?
Mike
On Sun, Feb 7, 2
On Tue, Feb 9, 2010 at 9:31 AM, Uwe Schindler wrote:
> The TestSpellChecker Executor problem seems to be a sun bug fixed in JDK
> 1.5.0_17 (awaitTermination problem:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6576792 and related bugs).
> We updated lucene-zones's JVM for builds to the
:)
Also LUCENE-2111 is where flex development is "continuing"...
LUCENE-1458 got too big (loading the page was slow).
Mike
On Wed, Feb 10, 2010 at 11:17 AM, Robert Muir wrote:
> Hello, I think of it as "Mike McCandless's Emacs workspace" (joking)
>
> it is a branch located here:
> https://svn.a
This is LUCENE-2118 -- it strikes every so often.
I think it's harmless (but really annoying). It looks like a corner
case in the merge policy where somehow a level is able to have 11
segments after merging thinks it's done.
Mike
On Thu, Feb 11, 2010 at 3:03 AM, Uwe Schindler wrote:
> Really c
Woops, right -- I just fixed it. Thanks for catching it :)
Mike
On Fri, Feb 12, 2010 at 6:55 AM, Koji Sekiguchi wrote:
> Mike,
>
> You said "removeUnusedFiles" in CHANGES.txt, but isn't it
> "deleteUnusedFiles"?
>
> Koji
>
> --
> http://www.rondhuit.com/en/
>
>
> mikemcc...@apache.org wrote:
>>
>>
ert Muir wrote:
> On Fri, Nov 27, 2009 at 1:27 PM, Michael McCandless
> wrote:
>>
>> Also one thing I'd love to try is NOT forking the JVM for each test
>> (fork="no" in the junit task). I wonder how much time that'd buy...
>>
>
> it shaves
IndexWriter has a default infoStream, so the infoStream could be
non-null during init.
Mike
On Sat, Feb 13, 2010 at 3:16 PM, Shai Erera wrote:
> Hi
>
> IndexWriter.init() checks a couple of times whether infoStream != null in
> order to print informative messages ... init() is called only from t
nally! Are there any possibilities inside Eclipse/other-IDEs
>> to check this?
>>
>> Uwe
>>
>> -
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> > -Origina
This class makes me somewhat nervous, with the changes coming in flex,
because the extensions are no longer static but rather a function of
the particular codec you're using in the index. I've changed some of
the constants accordingly (on flex).
Still, I think it's OK to make it public (flex has
ode, I didn't find in places I checked that a
> reference to *just* the extension name is needed.
>
> And thanks for correcting me on the package-private and back-compat thing.
> In my mind it was already public :).
>
> Shai
>
> On Tue, Feb 23, 2010 at 1:16 PM, Michael Mc
Sigh... yes, better to turn these into Jira issues in general.
We could make the change under Version? (Change to true, starting in 3.1).
Or maybe not make the change. If set to true, we use pct deletion on
a segment to reduce its perceived size when selecting merges, which
generally causes seg
On Tue, Feb 23, 2010 at 6:46 AM, Shai Erera wrote:
> I don't think performance is the issue here, but rather correctness. Someone
> cannot just ask filename.endsWith(DELETION_EXT) as files like "file1del"
> would match as well. So whenever you make such check, you need to add ".".
> Again, not per
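A tiny sketch of the correctness point above, assuming a hypothetical DELETION_EXT constant that holds the extension without the leading dot (the class and method names here are illustrative, not Lucene's):

```java
public class ExtensionCheck {
    // Hypothetical constant for illustration; assumed to hold no leading dot.
    static final String DELETION_EXT = "del";

    /** Naive check: also matches names that merely end in the same letters. */
    public static boolean naiveMatch(String fileName) {
        return fileName.endsWith(DELETION_EXT);
    }

    /** Safer check: requires the "." separator before the extension. */
    public static boolean matchesExtension(String fileName, String ext) {
        return fileName.endsWith("." + ext);
    }

    public static void main(String[] args) {
        System.out.println(naiveMatch("file1del"));                     // false positive
        System.out.println(matchesExtension("file1del", DELETION_EXT)); // rejected
        System.out.println(matchesExtension("_1.del", DELETION_EXT));   // accepted
    }
}
```

Having a single helper like matchesExtension means callers cannot forget to add the ".".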
+1 to both adding doAfterFlush and making the two methods protected. Patch?
Mike
On Tue, Feb 23, 2010 at 6:55 AM, Shai Erera wrote:
> Hi,
>
> Can we add to IW a doBeforeFlush(), similar to doAfterFlush(), which will
> get called before flush actually happens (i.e., at the beginning of
> flush()
+1 to release. I used each version's binary release to build & search a
5M wikipedia index.
Search performance is the same for TermQuery with both releases, but
PhraseQuery (at least the 3 simple 2-word phrases I tested) was
~9% faster (20.49 QPS -> 22.29 QPS). Not sure why... but it's movin
This sounds like a bug -- can you open an issue? Thanks!
Mike
On Wed, Feb 24, 2010 at 10:04 AM, Frank Wesemann
wrote:
> Hi,
> I am just getting my feet wet with the queryParser in contrib/queryparser.
> This new API is really a huge improvement.
> I am using it to convert Solr-Style input into
Possible approaches have been discussed on the list, fairly recently,
but I don't think there's active work against it...
Mike
On Thu, Feb 25, 2010 at 5:24 AM, Anshum wrote:
> Hi,
> I'd like to know whether we have something for field-level document updates
> planned for the near future? Something t
In thinking about & discussing with Robert how to allow Lucene to
support other scoring models, eg lnu.ltc, BM25, etc I think a
relatively contained set of changes can give us a solid step forward.
Something like this:
* Store additional per-doc stats in the index, eg in a custom
posting
This class is @lucene.experimental, so we are free to break it. +1 to
not "extends Vector".
I don't think we should change to @lucene.internal since the
thinking is apps outside Lucene should be able to introspect and see
segment structure in the index. Ie we made this API public so people
o
Seems OK I think?
Mike
On Sun, Feb 28, 2010 at 12:37 AM, Shai Erera wrote:
> Hi
>
> Do you think it's worth making some of the isDeleted method impls final,
> like in ReadOnlySegmentReader and (maybe) DirectoryReader? I'm thinking the
> classes that are perceived as final could benefit from tha
wrote:
> What's ok? making the classes final or just the method declaration? If
> classes, besides ReadOnlySegmentReader, which other impls do you think can
> be made final (I'm not in front of the code)?
>
> On Sun, Feb 28, 2010 at 7:05 PM, Michael McCandless
> wrote:
>>
>> > e.g. the collect methods in TFDC should be final and so on. But there is no
>> > requirement anymore. And Lucene 3.1 only runs with Java 5+, so who cares?
>> >
>> > -
>> > Uwe Schindler
>> > H.-H.-Meier-Allee 63, D-28213 Bremen
>> >
On Sun, Feb 28, 2010 at 1:38 PM, Marvin Humphrey wrote:
> On Fri, Feb 26, 2010 at 12:50:44PM -0500, Michael McCandless wrote:
>
>> * Store additional per-doc stats in the index, eg in a custom
>> posting list,
>
> Inline, as in a payload? Of course that can wo
wrote:
> In the analyzers case, I don't think its really door-shutting. if someone
> extends an Analyzer, its likely to just result in problems from the
> tokenStream/reusableTokenStream mess.
>
> On Wed, Mar 3, 2010 at 11:10 AM, Grant Ingersoll
> wrote:
>>
>
On Wed, Mar 3, 2010 at 11:10 AM, Grant Ingersoll wrote:
>
> On Mar 1, 2010, at 2:51 AM, Michael McCandless wrote:
>
>> Yeah in the case of DirectoryReader/MultiReader, I'd like for them to
>> be final, not for performance but for door-shutting (ie the same
>>
If Solr/Lucene dev were merged, and queryParser is its own module,
this user could simply upgrade his queryParser JAR to get this fix.
Mike
-- Forwarded message --
From: Alexander S (JIRA)
Date: Thu, Mar 4, 2010 at 2:24 AM
Subject: (SOLR-355) Parsing mixed inclusive/exclusive r
On Tue, Mar 2, 2010 at 4:12 PM, Marvin Humphrey wrote:
> On Tue, Mar 02, 2010 at 05:55:44AM -0500, Michael McCandless wrote:
>> The problem is, these scoring models need the avg field length (in
>> tokens) across the entire index, to compute the norms.
>>
>> Ie, you
Currently you can't tell IW to use the pool (ie, pool is only enabled
if you use NRT readers). We should probably make this an option at
ctor time, for situations like this. (In fact, in follow-on
discussions about further improvements to NRT we've already discussed
having such an option to IW's c
OK I opened:
https://issues.apache.org/jira/browse/LUCENE-2297
Mike
On Fri, Mar 5, 2010 at 10:25 AM, Michael McCandless
wrote:
> Currently you can't tell IW to use the pool (ie, pool is only enabled
> if you use NRT readers). We should probably make this an option at
>
On Fri, Mar 5, 2010 at 3:56 PM, Mark Miller wrote:
> On 03/05/2010 03:43 PM, Michael McCandless (JIRA) wrote:
>> [
>> https://issues.apache.org/jira/browse/LUCENE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842
On Fri, Mar 5, 2010 at 1:54 PM, Marvin Humphrey wrote:
> On Thu, Mar 04, 2010 at 12:23:38PM -0500, Michael McCandless wrote:
>> > In a multi-node search cluster, pre-calculating norms at index-time
>> > wouldn't work well without additional communication between nodes
On Mon, Mar 8, 2010 at 4:17 AM, wrote:
> Author: uschindler
> Date: Mon Mar 8 09:17:03 2010
> New Revision: 920240
>
> URL: http://svn.apache.org/viewvc?rev=920240&view=rev
> Log:
> Merge flex up to trunk rev 920237.
>
> This revision was left out, because it conflicted "heavy": 919060
> Message
On Sun, Mar 7, 2010 at 1:21 PM, Marvin Humphrey wrote:
> On Sat, Mar 06, 2010 at 05:07:18AM -0500, Michael McCandless wrote:
>> It won't encounter an unknown posting format. It's the codec. It
>> knows all posting formats by the time it sees it.
>
> OK, so you
On Mon, Mar 8, 2010 at 2:07 PM, Steven A Rowe wrote:
> On 03/08/2010 at 1:57 PM, Steven A Rowe wrote:
>> On 03/08/2010 at 1:13 PM, Michael McCandless wrote:
>> > On Sun, Mar 7, 2010 at 1:21 PM, Marvin Humphrey
>> > wrote:
>> > > On Sat, Mar 06, 2010 at 05:
On Sun, Mar 7, 2010 at 11:43 AM, Marvin Humphrey wrote:
> On Sat, Mar 06, 2010 at 05:07:18AM -0500, Michael McCandless wrote:
>> > Fortunately, beaming field length data around is an easier problem than
>> > distributed IDF, because with rare exceptions, the number of fie
On Mon, Mar 8, 2010 at 9:47 PM, Marvin Humphrey wrote:
> On Mon, Mar 08, 2010 at 01:13:53PM -0500, Michael McCandless wrote:
>> I think we can actually do so w/o losing Lucene's loose typing if we
>> simply peeled out [say] a FieldType class that holds the settings you
>
On Tue, Mar 9, 2010 at 2:28 AM, Marvin Humphrey wrote:
> On Mon, Mar 08, 2010 at 02:23:47PM -0500, Michael McCandless wrote:
>> For a large index the stats will be stable after re-indexing only a
>> few more docs.
>
> Well, not if there's been huge churn on other no
On Tue, Mar 9, 2010 at 10:03 AM, Marvin Humphrey wrote:
> On Tue, Mar 09, 2010 at 05:06:08AM -0500, Michael McCandless wrote:
>> > For what it's worth, that's sort of the way KS used to work:
>> > Schema/FieldType
>> > information was stored entirely in
On Tue, Mar 9, 2010 at 2:11 PM, Marvin Humphrey wrote:
>> > I don't know that compressing the raw materials is going to work as well as
>> > compressing the final product. Early quantization errors get compounded
>> > when
>> > used in later calculations.
>>
>> I would not compress for starters.
On Tue, Mar 9, 2010 at 3:58 PM, Marvin Humphrey wrote:
> On Tue, Mar 09, 2010 at 01:18:12PM -0500, Michael McCandless wrote:
>>
>> >> You said "of course" before but... how in your proposal could one
>> >> store all stats for a given field during indexi
Welcome aboard Chris!
Mike
On Fri, Mar 12, 2010 at 9:17 AM, Mark Miller wrote:
> I am happy to announce the Lucene PMC has accepted Chris Male as a
> contrib committer!
>
> Chris has been making a lot of headway in cleaning up the spatial contrib
> lately,
> and hopefully now we can get more of
On Thu, Mar 11, 2010 at 12:35 PM, Marvin Humphrey
wrote:
> On Mon, Mar 08, 2010 at 02:10:35PM -0500, Michael McCandless wrote:
>
>> We ask it to give us a Codec.
>
> There's a conflict between the segment-wide role of the "Codec" class and its
> role as speci
On Fri, Mar 12, 2010 at 8:31 PM, Marvin Humphrey wrote:
> On Thu, Mar 11, 2010 at 05:59:03AM -0500, Michael McCandless wrote:
>> > So there would be polymorphism in the decoding phase while we're supplying
>> > information the Similarity object needs to make its similari
I like the proposed new semantics (throw FNFE if the file does not
exist), and the migration path (new method, deprecate old).
Mike
On Sat, Mar 13, 2010 at 7:46 AM, Shai Erera wrote:
> I think it falls under the semantics of dir.fileLength() and not the
> semantics of the implementation right? U
Thanks!
Mike
On Sat, Mar 13, 2010 at 9:10 AM, Shai Erera wrote:
> Ok, opened LUCENE-2316 to track this.
>
> Shai
>
> On Sat, Mar 13, 2010 at 3:49 PM, Michael McCandless
> wrote:
>>
>> I like the proposed new semantics (throw FNFE if the file does not
>>
+1
Mike
On Sun, Mar 14, 2010 at 11:53 AM, Grant Ingersoll wrote:
> Given the notion of "one project, one set of committers", I think we should
> do away with the notion of contrib committers for java-dev and just have
> everyone be committers. Practically speaking, this would make all existin
On Mon, Mar 15, 2010 at 12:03 AM, Marvin Humphrey
wrote:
> On Sat, Mar 13, 2010 at 06:41:26AM -0500, Michael McCandless wrote:
>
>> I still don't think similarity should have any bearing during indexing.
>
> Similarity has always, from day one, affected the contents of
The merge of Solr and Lucene dev is well underway... Lucene already
has a bunch of new committers... welcome aboard!
And overnight tons of work was done (and beer, espresso and tea,
depending on your timezone, consumed ;) and now we already
have a branch where Solr has been upgraded to Lucene's tr
On Tue, Mar 16, 2010 at 2:51 AM, Michael Busch wrote:
> On 3/16/10 12:43 AM, Simon Willnauer wrote:
>>
>> If my impression should be wrong or if I miss something please ignore
>> the last paragraph.
>
> I feel exactly like you, Simon. I don't understand the rush. Also, we're
> in review-and-comm
I think I like the 1st option best (lucene moves as subdir to solr's
current trunk SVN path), but I don't feel strongly.
This'd mean one could simply checkout lucene alone and do everything
you can do today.
But if you check out solr, you also get a full checkout of lucene, and
solr's build.xml
+1, this looks great!
Mike
On Tue, Mar 16, 2010 at 1:52 PM, Andi Vajda wrote:
>
> On Mar 16, 2010, at 11:47, Steven A Rowe wrote:
>
>> On 03/16/2010 at 6:06 AM, Michael McCandless wrote:
>>>
>>> Does anyone know how other projects fold in IRC...?
>>
>
On Tue, Mar 16, 2010 at 2:17 PM, Michael Busch wrote:
> But at the same time can we make sure that the decisions that are made on
> IRC are still being described in a jira issue?
+1
Any time something is discussed on IRC, it must be summarized on the
lists or in an issue, with the details based
The primary concern seems to be ensuring that, once we
merge svn, one can still checkout & build & run tests/etc for
Lucene alone.
If we move lucene under Solr's existing svn path, ie:
/solr/trunk/lucene
and then fixup solr's build files to go and compile sources from the
lucene dir, run tests
Dev is now merged with Solr and Lucene -- that has already passed. If
that will scare customers away, that's a risk we take -- the benefits
of merged dev outweigh that, in my opinion.
The incremental risk that the details of our svn URLs will scare
people away seems negligible.
And we can always
But it's actually the reverse? Solr depends on Lucene but not vice versa.
(If instead I proposed making Solr a subdir of Lucene then I'd agree)
So... if you checkout only lucene, you can cd there and do all you do
today with Lucene ("ant test", "ant dist", "svn diff", etc.).
If you checkout
+1
I like this proposal!
I agree we should not preclude the future (modules), let's just not
hold up dev today until we solve it.
I agree your side by side solution would allow for us to later factor
up modules (eg analyzers).
Mike
On Tue, Mar 16, 2010 at 5:47 PM, Michael McCandless
Duh -- I meant to reply to Hoss' proposal, below:
On Tue, Mar 16, 2010 at 5:55 PM, Michael McCandless
wrote:
> +1
>
> I like this proposal!
>
> I agree we should not preclude the future (modules), let's just not
> hold up dev today until we solve it.
>
> I agre
You're right!
Really we should delete from sync'd when we delete the files. We need
to tie into IndexFileDeleter for that, maybe moving this set into
there.
Though in practice the amount of actual RAM used should rarely be an
issue? But we should fix it...
Can you open an issue?
Mike
On Wed,
Thanks!
Mike
On Wed, Mar 17, 2010 at 3:16 PM, Gregor Kaczor wrote:
> followup in
>
> https://issues.apache.org/jira/browse/LUCENE-2328
>
>
> Original Message
>> Date: Wed, 17 Mar 2010 14:30:25 -0500
>> From: Michael McCandless
>> To: j
On Mon, Mar 15, 2010 at 7:49 PM, Marvin Humphrey wrote:
> On Mon, Mar 15, 2010 at 05:28:33AM -0500, Michael McCandless wrote:
>> I mean specifically one should not have to commit to the precise
>> scoring model they will use for a given field, when they index that
>> field.
Unfortunately, highlighter (and I think also fast vector highlighter)
are able to return a set of fragments which do not match the
query (eg, they only show one of the two required terms).
I really don't like that they do this.
Ideally (to me) the entire excerpt (ie, all fragments appended
togeth
All tests pass for me :)
Mike
On Thu, Mar 18, 2010 at 12:27 PM, Mark Miller wrote:
> Alright, so we have implemented Hoss' suggestion here on the lucene/solr
> merged dev branch at lucene/solr/branches/newtrunk.
>
> Feel free to check it out and give some feedback.
>
> We also roughly have Solr r
If you build the ords per-segment, how do you compare results across segments?
Ie, in the non-Collator case, Lucene stores ords but must also store
the actual String so that the FieldComparator is able to compare
results across segments.
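A toy sketch of why that is (this is not the FieldComparator code; Segment, Hit and compare are made-up names): an ord is only an index into one segment's sorted values, so ords from different segments say nothing about each other, and the comparison has to fall back to the values themselves.

```java
import java.util.Arrays;

public class SegmentOrds {

    /** A toy "segment": its distinct field values, sorted; an ord indexes this array. */
    public static final class Segment {
        final String[] sortedValues;

        public Segment(String... values) {
            sortedValues = values.clone();
            Arrays.sort(sortedValues);
        }

        public int ordOf(String value) {
            return Arrays.binarySearch(sortedValues, value);
        }

        public String valueOf(int ord) {
            return sortedValues[ord];
        }
    }

    /** A hit remembers which segment it came from and its ord within that segment. */
    public static final class Hit {
        public final Segment segment;
        public final int ord;

        public Hit(Segment segment, int ord) {
            this.segment = segment;
            this.ord = ord;
        }
    }

    /** Ords are comparable only within one segment; otherwise compare the values. */
    public static int compare(Hit a, Hit b) {
        if (a.segment == b.segment) {
            return Integer.compare(a.ord, b.ord);
        }
        return a.segment.valueOf(a.ord).compareTo(b.segment.valueOf(b.ord));
    }

    public static void main(String[] args) {
        Segment s1 = new Segment("zebra", "apple");
        Segment s2 = new Segment("mango", "banana", "cherry");
        Hit zebra = new Hit(s1, s1.ordOf("zebra")); // ord 1 in s1
        Hit mango = new Hit(s2, s2.ordOf("mango")); // ord 2 in s2
        // Raw ords would wrongly put "zebra" first; the values put it last.
        System.out.println(compare(zebra, mango) > 0);
    }
}
```

With a Collator the same shape applies, but the cross-segment fallback then needs something comparable under that Collator (the String itself or a collation key), which is what the question is getting at.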
Mike
On Fri, Mar 19, 2010 at 10:06 AM, Toke Eskildsen
wro
On Fri, Mar 19, 2010 at 12:46 PM, Toke Eskildsen
wrote:
> However, it is not set in stone that we will shift to using Exposed or
> similar: As many others we're pursuing real-time indexing and while Exposed
> sits at the segment-level and thus works well for re-open, big segment
> changes sti
aphi.de
> eMail: u...@thetaphi.de
>
>> -Original Message-
>> From: Michael McCandless (JIRA) [mailto:j...@apache.org]
>> Sent: Monday, March 22, 2010 11:22 AM
>> To: java-dev@lucene.apache.org
>> Subject: [jira] Resolved: (LUCENE-2297) IndexWriter sho
t you were
> suggesting?
> Cheers
> Chris
>
> On Mon, Mar 22, 2010 at 11:37 AM, Michael McCandless
> wrote:
>>
>> I think we should.
>>
>> It (newtrunk) was created to test Hoss's side-by-side proposal, and
>> that approach looks to be working very we
+1, let's do this now.
Mike
On Mon, Mar 22, 2010 at 11:44 AM, Grant Ingersoll wrote:
> Shall we merge the dev mailing lists? This should reduce the cross-posting
> and can be completely automated (other than you may have to update your
> client-side filters) and was part of the plan to merge
+1
Mike
On Mon, Mar 22, 2010 at 11:53 AM, Ryan McKinley wrote:
> why not just "d...@lucene.apache.org"?
>
>
>
> On Mon, Mar 22, 2010 at 11:44 AM, Grant Ingersoll wrote:
>> Shall we merge the dev mailing lists? This should reduce the cross-posting
>> and can be completely automated (other than
You can create your own Similarity implementation?
Mike
On Mon, Mar 22, 2010 at 12:32 PM, zsl wrote:
>
> Hi all!
>
> I'm developing an application that uses Lucene and I'm trying to set the IDF
> manually before I run a query.
> In other words, is there a way to do a search query with an IDF value
>
You can implement just the "out of order" collector, since it subsumes
the in-order case, and all will work fine.
However, if the collector can save CPU when docs are known to arrive
in-order (not all collectors can) it'd be good to make a separate
in-order one as well.
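Roughly, the difference looks like the sketch below. It is a simplified top-N model, not the real Collector API; all class names are made up, it assumes n >= 1, and it assumes ties on score are broken in favor of the lower doc id. The out-of-order version must compare doc ids on a tie, while the in-order version can drop any hit that merely ties the weakest one in the queue:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopDocsSketch {

    /** One hit: a doc id and its score. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Weakest hit first: lowest score, then highest doc id on ties.
    static final Comparator<Hit> WEAKEST_FIRST = (a, b) -> {
        int c = Float.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc);
    };

    public abstract static class TopCollector {
        final int n; // assumed >= 1
        final PriorityQueue<Hit> queue = new PriorityQueue<>(WEAKEST_FIRST);

        TopCollector(int n) { this.n = n; }

        public abstract void collect(int doc, float score);

        void insert(int doc, float score) {
            queue.add(new Hit(doc, score));
            if (queue.size() > n) queue.poll(); // evict the weakest hit
        }

        /** Doc ids of the collected hits, best first. */
        public List<Integer> topDocs() {
            List<Hit> hits = new ArrayList<>(queue);
            hits.sort(WEAKEST_FIRST.reversed());
            List<Integer> docs = new ArrayList<>();
            for (Hit h : hits) docs.add(h.doc);
            return docs;
        }
    }

    /** Correct for any arrival order: ties need a doc id comparison. */
    public static final class OutOfOrder extends TopCollector {
        public OutOfOrder(int n) { super(n); }

        @Override
        public void collect(int doc, float score) {
            if (queue.size() == n) {
                Hit weakest = queue.peek();
                if (score < weakest.score
                        || (score == weakest.score && doc > weakest.doc)) {
                    return; // cannot beat the weakest hit
                }
            }
            insert(doc, score);
        }
    }

    /** Only correct if doc ids arrive in increasing order; saves the tie check. */
    public static final class InOrder extends TopCollector {
        public InOrder(int n) { super(n); }

        @Override
        public void collect(int doc, float score) {
            // A later doc that only ties the weakest score always loses the tie.
            if (queue.size() == n && score <= queue.peek().score) return;
            insert(doc, score);
        }
    }
}
```

Feeding docs 1, 3, 5 (all with the same score) to InOrder(2) and docs 5, 3, 1 to OutOfOrder(2) should both leave docs 1 and 3 in the queue; feeding the reversed order to InOrder would not, which is why the out-of-order collector is the safe default.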
Mike
On Tue, Mar 23, 2010
OK put it up! Sounds good :)
Mike
On Tue, Mar 23, 2010 at 1:54 PM, Grant Ingersoll wrote:
>
> On Mar 23, 2010, at 1:20 PM, Michael McCandless wrote:
>
>> You can implement just the "out of order" collector, since it subsumes
>> the in-order case, and all will w
Ahh, very nice!
Mike
On Wed, Mar 24, 2010 at 11:39 AM, Michael McCandless (JIRA)
wrote:
>
> [
> https://issues.apache.org/jira/browse/LUCENE-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849228#action_12849228
> ]
>
On Mon, Mar 22, 2010 at 12:45 PM, Marvin Humphrey
wrote:
> On Thu, Mar 18, 2010 at 05:16:23AM -0500, Michael McCandless wrote:
>> Also, will Lucy store the original stats?
>
> These?
>
> * Total number of tokens in the field.
> * Number of unique terms in th
I'm happy to announce that the PMC has accepted Shai Erera as
Lucene/Solr committer!
Welcome aboard Shai,
Mike
PS: it's customary to introduce yourself with a brief bio :)
I think we should consolidate all query parsers as a module? And all
queries (contrib/queries + oal/search/*Query)?
I don't think we should leave "basic X" inside core... I think there
should be one place to get the different Xs lucene offers (where X is
a query parser, queries, analyzers, etc.).
I'm merging the conflicts now... it turned out to cause a number of
conflicts because in flex I changed how DW stores the terms in RAM, to
first prefix-code each term's length as vInt (most often 1 byte, but
in messed up cases 2 bytes), and then the term's characters as
UTF8 bytes. This cause
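For readers who have not seen the layout, here is a small standalone sketch of a vInt length prefix followed by the UTF8 bytes. It is not the DocumentsWriter code: the class and method names are made up, and it assumes the prefix counts UTF8 bytes. Lengths under 128 take 1 prefix byte and lengths up to 16383 take 2:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class PrefixCodedTerm {

    /** Writes value as a vInt: 7 bits per byte, high bit set on all but the last byte. */
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /** Encodes a term as [vInt byte length][UTF8 bytes]. */
    public static byte[] encode(String term) {
        byte[] utf8 = term.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }

    /** Decodes a term written by encode(), starting at offset 0. */
    public static String decode(byte[] buf) {
        int length = 0;
        int shift = 0;
        int pos = 0;
        byte b;
        do {
            b = buf[pos++];
            length |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return new String(buf, pos, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] encoded = encode("lucene");
        System.out.println(encoded.length + " bytes, decodes to " + decode(encoded));
    }
}
```

The upside of a length prefix is that a reader can skip over a term without scanning its bytes for a terminator.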
Right... in fact as long as we land flex before 3.1 releases then this
is not a back-compat break (but we should heavily advertise the change
in semantics) ;)
Ie Directory.copy used to filter for only index files, but
Directory.copyTo copies everything so you must provide your own list
if this mat
Ugh, actually, it is still a back-compat break :(
Because Directory.copy just forwards to copyTo.
I'll advertise in CHANGES for flex.
Mike
On Sat, Mar 27, 2010 at 4:41 PM, Michael McCandless
wrote:
> Right... in fact as long as we land flex before 3.1 releases then this
> is not a
ll advertise in CHANGES for flex.
>>
>> Mike
>>
>> On Sat, Mar 27, 2010 at 4:41 PM, Michael McCandless
>> wrote:
>>> Right... in fact as long as we land flex before 3.1 releases then this
>>> is not a back-compat break (but we should heavily adv
I agree this is a long overdue feature... we need to get it into
Lucene somehow.
I like the Layers analogy... I think that will work well with Lucene's
transactional semantics, ie a prior commit point would continue to see
the index before the updates but new commit points would see the
updates.
On Thu, Mar 25, 2010 at 1:20 PM, Marvin Humphrey wrote:
> On Thu, Mar 25, 2010 at 06:24:34AM -0400, Michael McCandless wrote:
>> >> Also, will Lucy store the original stats?
>> >
>> > These?
>> >
>> > * Total number of tokens in the
I think that's a good idea for Lucy.
Mike
On Fri, Mar 26, 2010 at 10:58 AM, Marvin Humphrey
wrote:
> On Thu, Mar 25, 2010 at 06:24:34AM -0400, Michael McCandless wrote:
>> > Maybe aggressive automatic data-reduction makes more sense in the context
>> > of
>>
I think the time has finally come! Pending one issue (LUCENE-2354 --
Uwe), I think flex is ready to land. I think the other issues with Fix
Version = Flex Branch can be moved to 3.1 after we land.
We still use the pre-flex APIs in a number of places... I think this
is actually good (so we cont
Welcome Uwe!!
Mike
On Thu, Apr 1, 2010 at 7:05 AM, Grant Ingersoll wrote:
> I'm pleased to announce that the Lucene PMC has voted to add Uwe Schindler to
> the PMC. Uwe has been doing a lot of work in Lucene and Solr, including
> several of the last releases in Lucene.
>
> Please join me in e
On Sat, Apr 3, 2010 at 1:25 AM, Babak Farhang wrote:
>> I think they get merged in by the merger, ideally in the background.
>
> That sounds sensible. (In other words, we won't concern ourselves with
> roll backs--something possible while a "layer" is still around.)
Actually roll backs would still
The flex API isolates fields, ie you get a TermsEnum for a given field
and it enums only the term's text (as a BytesRef).
Mike
On Mon, Apr 5, 2010 at 7:22 PM, Earwin Burrfoot wrote:
> A random thought from some of the earlier discussions.
>
> Had anybody used the fact that Lucene Term space is c
have to be ordered if we
> introduce updates? Or does the onus of maintaining order fall on the
> application?
>
> -Babak
>
> On Sat, Apr 3, 2010 at 3:28 AM, Michael McCandless
> wrote:
>> On Sat, Apr 3, 2010 at 1:25 AM, Babak Farhang wrote:
>>>> I think the
On Tue, Apr 6, 2010 at 10:11 AM, Earwin Burrfoot wrote:
> So, I want to pump my IndexWriter hard and fast with documents.
Nice.
> Removing fsync from FSDirectory helps. But for that I pay with possibility of
> index corruption, not only if my node suddenly loses
> power/kernelpanics, but also i
On Tue, Apr 6, 2010 at 7:26 PM, Earwin Burrfoot wrote:
>> Running out of disk space with fsync disabled won't lead to corruption.
>> Even kill -9 the JRE process with fsync disabled won't corrupt.
>> In these cases index just falls back to last successful commit.
>>
>> It's "only" power loss / OS
Yes +1 to that -- thanks Uwe!!
And thanks to the many other people who helped out on flex. It's a
big and exciting improvement :)
Mike
On Wed, Apr 7, 2010 at 4:11 PM, Michael Busch wrote:
> Uwe, thanks for doing all the svn work! Was a smooth transition!
>
> Michael
>
> On 4/6/10 12:27 PM,
On Wed, Apr 7, 2010 at 3:27 PM, Earwin Burrfoot wrote:
>> No, this doesn't make sense. The OS detects a disk full on accepting
>> the write into the write cache, not [later] on flushing the write
>> cache to disk. If the OS accepts the write, then disk is not full (ie
>> flushing the cache will
+1
I don't think bw needs to be kept -- contrib/benchmark is allowed to change.
Mike
On Thu, Apr 8, 2010 at 5:44 AM, Shai Erera wrote:
> Hi
>
> I've noticed benchmark has a NoDeletionPolicy class and I was wondering if
> we can move it to core. I might want to use it for the parallel index stuf
Actually Toke opened a new issue (LUCENE-2369) for the new approach to
Locale-based sorting... I think we should leave the existing issue as
the single-segment optimization (it's a separate issue).
Mike
On Thu, Apr 8, 2010 at 6:06 PM, Chris Hostetter
wrote:
>
> : > Is it possible to change it? I
On Thu, Apr 8, 2010 at 6:21 PM, Earwin Burrfoot wrote:
>> But, IW doesn't let you "hold on to" checkpoints... only to commits.
>>
>> Ie SnapshotDP will only "see" actual commit/close calls, not
>> intermediate checkpoints like a random segment merge completing, a
>> flush happening, etc.
>>
>> Or.