Re: [DISCUSS] Do away with Contrib Committers and make core committers

2010-03-14 Thread Michael Busch
+1 Michael On 3/14/10 9:53 AM, Grant Ingersoll wrote: Given the notion of "one project, one set of committers", I think we should do away with the notion of contrib committers for java-dev and just have everyone be committers. Practically speaking, this would make all existing contrib commi

Re: Welcome new committers!

2010-03-15 Thread Michael Busch
Welcome guys! :) Sounds really like some great progress in such a short time! Michael On 3/15/10 8:25 AM, Michael McCandless wrote: The merge of Solr and Lucene dev is well underway... Lucene already has a bunch of new committers... welcome aboard! And overnight tons of work was done (and be

Re: lucene and solr trunk

2010-03-16 Thread Michael Busch
I completely agree with Uwe and Hoss. These questions need to be addressed first. I still want to be able to only checkout Lucene code and run the Lucene build independently from Solr. And Lucene needs to be able to release without Solr and the branching/tagging needs to support that as Uwe

Re: lucene and solr trunk

2010-03-16 Thread Michael Busch
On 3/16/10 12:43 AM, Simon Willnauer wrote: If my impression should be wrong or if I miss something please ignore the last paragraph. I feel exactly like you, Simon. I don't understand the rush. Also, we're in review-and-commit process, not commit-and-review. Changes have to be propose

Re: #lucene IRC log [was: RE: lucene and solr trunk]

2010-03-16 Thread Michael Busch
It would be very cool to have a searchable archive for the IRC discussions, so +1. But at the same time can we make sure that the decisions that are made on IRC are still being described in a jira issue? I don't mean that people should repeat brainstorming, but if a discussion leads to opening a Ji

Re: lucene and solr trunk

2010-03-16 Thread Michael Busch
What about tagging and branching? When we cut a Lucene release we also tag Solr, even though it's not being released? Michael On 3/16/10 3:47 PM, Michael McCandless wrote: But it's actually the reverse? Solr depends on Lucene but not vice/versa. (If instead I proposed making Solr a subdir

Re: Mailing List merge

2010-03-22 Thread Michael Busch
+1 Michael On Mar 22, 2010, at 8:55 AM, Michael McCandless wrote: +1 Mike On Mon, Mar 22, 2010 at 11:53 AM, Ryan McKinley wrote: why not just "d...@lucene.apache.org"? On Mon, Mar 22, 2010 at 11:44 AM, Grant Ingersoll wrote: Shall we merge the dev mailing lists? This should red

Running the Solr/Lucene tests failed

2010-03-23 Thread Michael Busch
Hi all, I wanted to commit LUCENE-2329. I just checked out the new combined trunk https://svn.apache.org/repos/asf/lucene/dev/trunk and ran "ant test". After 20 mins the build failed on the unmodified code (see below). I hadn't applied my patch yet. What's the status of the combined trunk

Re: Running the Solr/Lucene tests failed

2010-03-23 Thread Michael Busch
On 3/23/10 1:07 PM, Robert Muir wrote: Maybe, the Solr test TestLBHttpSolrServer failed for me randomly before this parallelization though, and still does. In general the jetty tests have caused me some grief. But it's also equally likely I broke it for you somehow... Michael, can you try runnin

Re: Running the Solr/Lucene tests failed

2010-03-23 Thread Michael Busch
I see this problem running the tests sequentially too. Test org.apache.solr.client.solrj.embedded.JettyWebappTest FAILED On Tue, Mar 23, 2010 at 4:15 PM, Michael Busch wrote: Sorry for the lack of details. Thought I had just not done an obvious step. Attached is the output from the Solr

Re: Running the Solr/Lucene tests failed

2010-03-23 Thread Michael Busch
On 3/23/10 2:12 PM, Yonik Seeley wrote: On Tue, Mar 23, 2010 at 5:07 PM, Michael Busch wrote: OK I reran the tests sequentially with my LUCENE-2329 patch applied. The same test failed again: [junit] Test org.apache.solr.client.solrj.embedded.JettyWebappTest FAILED Everything else looks

Re: Welcome Shai Erera as Lucene/Solr committer

2010-03-26 Thread Michael Busch
Welcome Shai! On 3/26/10 8:04 AM, Shai Erera wrote: I'll start with Parallel Index (thanks Michael B. for introducing it!) and continue to more exciting things. Awesome! I'll help where I can. I'm still excited about all the cool stuff you can do with parallel indexing, time is my only

Re: Modules

2010-03-26 Thread Michael Busch
On 3/26/10 7:16 AM, Grant Ingersoll wrote: So, should we start thinking about a Modules dir at the same level as Lucene/Solr where shared, non-core code lives? For starters, I think spatial and analyzers could go there. Proposal: lucene/ solr/ modules/ analyzers spatial others (hi

Re: svn commit: r928246 [1/6] - in /lucene/java/branches/flex_1458: ./ backwards/src/ backwards/src/java/org/apache/lucene/search/ backwards/src/test/org/apache/lucene/analysis/ backwards/src/test/org

2010-03-27 Thread Michael Busch
upto trunk rev 928243 The following revision of Michael Busch was left out (I do not know how this affects flex): 926791 (buschmi, LUCENE-2329: Use parallel arrays instead of PostingList objects in TermsHash*) Also I added a noncommit to the Directory.copyTo() method, as the IndexFileNameFi

Re: Commit freeze in flex branch

2010-04-07 Thread Michael Busch
Uwe, thanks for doing all the svn work! Was a smooth transition! Michael On 4/6/10 12:27 PM, Uwe Schindler wrote: The freeze is over, we merged successfully. If you had a flex branch checked out: svn switch https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene Uwe - Uwe Schindler

Re: Proposal about Version API "relaxation"

2010-04-13 Thread Michael Busch
I agree with Uwe. We shouldn't use non-final public statics. Thinking out loud: Could IndexWriter/IndexReader propagate the Version to the downstream classes (e.g. IndexWriter to Analyzers, IndexReader to queries) if not previously explicitly set? E.g. an IndexWriter calls setVersion on an

Flexible index format / Payloads Cont'd

2006-06-29 Thread Michael Busch
think about my suggestions. If people like this approach, then I can add the information to the Wiki planning page and start working on it. Best Regards, Michael Busch - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: Flexible index format / Payloads Cont'd

2006-06-30 Thread Michael Busch
so that people can start developing their own stuff. Later, if people submit good solutions, those might be good candidates for contrib. Marvin Humphrey Rectangular Research http://www.rectangular.com/ Regards, Michael Busch

Re: Flexible index format / Payloads Cont'd

2006-07-05 Thread Michael Busch
Doug Cutting wrote: Marvin Humphrey wrote: IMO, this should wait. It's going to be freakishly difficult to get this stuff to work and maintain the commitments that Doug has laid out for backwards compatibility. Perhaps we can implement an all-new index format, in a new package. An implemen

Re: [jira] Updated: (LUCENE-624) Segment size limit for compound files

2006-07-27 Thread Michael Busch
relatively low. If I find some time I will run performance experiments to get some numbers. Michael On Jul 26, 2006, at 5:18 PM, Michael Busch (JIRA) wrote: [ http://issues.apache.org/jira/browse/LUCENE-624?page=all ] Michael Busch updated LUCENE-624

Re: Dynamically varying maxBufferedDocs

2006-11-09 Thread Michael Busch
I had the same problem with large documents causing memory problems. I solved this problem by introducing a new setting in IndexWriter setMaxBufferSize(long). Now a merge is either triggered when bufferedDocs==maxBufferedDocs *or* the size of the bufferedDocs >= maxBufferSize. I made these chan

Re: Dynamically varying maxBufferedDocs

2006-11-09 Thread Michael Busch
This sounds good. Michael, I'd love to see your patch, Chuck Ok, I'll probably need a few days before I can submit it (have to code unit tests and check if it compiles with the current head), because I'm quite busy with other stuff right now. But you will get it soon :-)
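
The flush rule described in this thread (flush when the buffered document count reaches maxBufferedDocs *or* the buffered size reaches maxBufferSize) could be sketched roughly as below. This is only an illustration under assumptions: the class BufferFlushPolicy and its method shouldFlush are hypothetical names, not part of Lucene's IndexWriter, and the actual patch mentioned above is not shown in the archive.

```java
// Hypothetical sketch of the two-condition flush rule discussed in the thread.
// BufferFlushPolicy and shouldFlush are illustrative names, not Lucene API.
final class BufferFlushPolicy {
    private final int maxBufferedDocs;     // flush after this many buffered docs
    private final long maxBufferSizeBytes; // ...or once the buffer is this large

    BufferFlushPolicy(int maxBufferedDocs, long maxBufferSizeBytes) {
        if (maxBufferedDocs < 1 || maxBufferSizeBytes < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxBufferedDocs = maxBufferedDocs;
        this.maxBufferSizeBytes = maxBufferSizeBytes;
    }

    // Returns true when either limit is reached. Using >= for the doc count
    // is equivalent to == when the check runs after every added document.
    boolean shouldFlush(int bufferedDocs, long bufferedBytes) {
        return bufferedDocs >= maxBufferedDocs
            || bufferedBytes >= maxBufferSizeBytes;
    }
}
```

With limits of 10 documents and 1000 bytes, for example, three buffered documents totalling 1000 bytes would already trigger a flush, which is the large-document case the thread is concerned with.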

Re: [jira] Commented: (LUCENE-721) Code coverage reports

2006-11-21 Thread Michael Busch
Chris Hostetter wrote: : Nice. I think we can't include EMMA jars in the repository, though, so : you'll want to add the ability to download the Jar on the fly, just like : Grant did it for the benchmark stuff. that's not strictly necessary is it? ... coverage reports could just be an optional

Re: [jira] Commented: (LUCENE-721) Code coverage reports

2006-11-22 Thread Michael Busch
Chris Hostetter wrote: To throw another twist onto things, it would appear that the ASF has a License for Clover 1.3.2 donated by Cenqua that Committers have access to (see committers/donated-licenses/clover in SVN) ... it's not clear to me if that License would allow for auto generated reports o

Re: [jira] Resolved: (LUCENE-709) [PATCH] Enable application-level management of IndexWriter.ramDirectory size

2006-11-22 Thread Michael Busch
Ning Li wrote: I was away so I'm catching up. If this (occasional large documents consume too much memory) happens to a few applications, should it be solved in IndexWriter? A possible design could be: First, in addDocument(), compute the byte size of a ram segment after the ram segment is crea

Re: [jira] Updated: (LUCENE-721) Code coverage reports

2006-11-28 Thread Michael Busch
Chris Hostetter wrote: : Here it is, Grant. This new patch uses Clover to generate code coverage : reports. Simply add clover.jar to the ant classpath, do a "clean" and : run the target "test". During compiling Clover will automatically : instrument all classes under src/java. haven't had a chan

Re: Attached proposed modifications to Lucene 2.0 to support Field.Store.Encrypted

2006-12-05 Thread Michael Busch
negrinv wrote: there is a third way Doug, and it's for me to stop trying to be polite by answering all the questions that I am being asked, then nobody will get upset by my replies. If the decision is for no encryption at field level, I accept it, but I don't believe it should be externalised. P

Re: IBM OmniFind Yahoo! Edition

2006-12-14 Thread Michael Busch
Thank you Doug and Andreas!! A year ago I didn't know anything about Lucene and, in general, hadn't much experience in open source. I have to say that it was a lot of fun to use Lucene and especially to work with the community. It is impressive how responsive the developers and committers are

Payloads

2006-12-20 Thread Michael Busch
Hi all, currently it is not possible to add generic payloads to a posting list. However, this feature would be useful for various use cases. Some examples: - XML search to index XML documents and allow structured search (e.g. XPath) it is necessary to store the depths of the terms - part-of

Re: Payloads

2006-12-20 Thread Michael Busch
Nicolas Lalevée wrote: Le Mercredi 20 Décembre 2006 15:31, Grant Ingersoll a écrit : Hi Michael, Have a look at https://issues.apache.org/jira/browse/LUCENE-662 I am planning on starting on this soon (I know, I have been saying that for a while, but I really am.) At any rate, another set o

Re: Payloads

2006-12-20 Thread Michael Busch
Doug Cutting wrote: Michael, This sounds like very good work. The back-compatibility of this approach is great. But we should also consider this in the broader context of index-format flexibility. Three general approaches have been proposed. They are not exclusive. 1. Make the index form

Re: Payloads

2006-12-21 Thread Michael Busch
Doug Cutting wrote: A reason not to commit something like this now would be if it complicates the effort to make the format extensible. Each index feature we add now will require back-compatibility in the future, and we should be hesitant to add features that might be difficult to support i

Re: Payloads

2006-12-22 Thread Michael Busch
Nicolas Lalevée wrote: I have just looked at it. It looks great :) Thanks! :-) But I still don't understand why a new entry in the fieldinfo is needed. The entry is not really *needed*, but I use it for backwards-compatibility and as an optimization for fields that don't have any

Re: Payloads

2007-01-18 Thread Michael Busch
Doug, sorry for the late response. I was on vacation after New Year's... oh btw. Happy New Year to everyone! :-) Doug Cutting wrote: Michael Busch wrote: Yes I could introduce a new class called e.g. PayloadToken that extends Token (good that it is not final anymore). Not sure

Re: Payloads

2007-01-18 Thread Michael Busch
Nadav Har'El wrote: Hi Michael, For some uses (e.g., faceted search), one wants to add a payload to each document, not per position for some text field. In the faceted search example, we could use payloads to encode the list of facets that each document belongs to. For this, with the old API, y

Re: Payloads

2007-01-18 Thread Michael Busch
Grant Ingersoll wrote: Just to put in two cents: the Flexible Indexing thread has also talked about the notion of being able to store arbitrary data at: token, field, doc and Index level. -Grant Yes I agree that this should be the long-term goal. The payload feature is just a first step in

Re: Payloads

2007-01-18 Thread Michael Busch
Nadav Har'El wrote: On Thu, Jan 18, 2007, Michael Busch wrote about "Re: Payloads": As you pointed out it is still possible to have per-doc payloads. You need an analyzer which adds just one Token with payload to a specific field for each doc. I understand that this code woul

Re: Payloads

2007-01-19 Thread Michael Busch
Marvin Humphrey wrote: On Jan 18, 2007, at 8:31 AM, Michael Busch wrote: I think it makes sense to add new functions incrementally, as long as we try to only extend the API in a way, so that it is compatible with the long-term goal, as Doug suggested already. After the payload patch is

Re: Payloads

2007-01-19 Thread Michael Busch
Grant Ingersoll wrote: Couldn't agree more. This is good progress. I like the payloads patch, but I would like to see the lazy prox stream (Lucene 761) stuff done (or at least details given on it) so that we can hook this into Similarity so that it can be hooked into scoring. For 761 and th

Re: Lucene 2.1, soon

2007-02-01 Thread Michael Busch
Michael McCandless wrote: I plan on committing this one today. Once that's in I think we can and should get the release process going (Yonik had graciously volunteered to be the release manager)? +1 for starting the release process. Especially the big new features "lazy field loading", "lock

Re: Welcome Michael Busch

2007-02-02 Thread Michael Busch
Doug Cutting wrote: The Lucene PMC has voted to add Michael Busch as a Lucene committer. Welcome, Michael! Doug Thanks everyone for the nice words! Of course I want to keep the tradition alive, so here follows my introduction :-) I am from Germany (more exactly from the Sauerland :-) ). I

Problem with updating the website

2007-02-02 Thread Michael Busch
Hi, I just added myself to the "Who we are" page, regenerated it and committed the changes. Now I tried to update the website by doing: ssh people.apache.org cd /www/lucene.apache.org/java/docs svn up It fails with the following message: svn: Can't open file 'images/.svn/lock': Permission

Re: Problem with updating the website

2007-02-05 Thread Michael Busch
Chris Hostetter wrote: : I just added myself to the "Who we are" page, regenerated it and : committed the changes. Now I tried to update the website by doing: : ssh people.apache.org : cd /www/lucene.apache.org/java/docs : svn up just to clarify: is that what you tried because you saw it m

Re: [VOTE] release Lucene 2.1

2007-02-14 Thread Michael Busch
+1 Michael Yonik Seeley wrote: Release artifacts for review are at http://people.apache.org/~yonik/staging_area/lucene/ Please vote to officially release these packages as Lucene 2.1. -Yonik

Re: Welcome Doron Cohen!

2007-02-15 Thread Michael Busch
Yonik Seeley wrote: I'm pleased to announce that the Lucene PMC has voted to make Doron Cohen a Lucene committer. Congrats, and welcome aboard, Doron! -Yonik Awesome news, Doron! Welcome aboard. Congratulations, Michael

Flexible indexing (was: Re: [jira] Commented: (LUCENE-755) Payloads)

2007-03-10 Thread Michael Busch
Key: LUCENE-755 URL: https://issues.apache.org/jira/browse/LUCENE-755 Project: Lucene - Java Issue Type: New Feature Components: Index Reporter: Michael Busch Assigned To: Michael Busch Attachments: payload.patch, payloads.p

Re: Flexible indexing

2007-03-11 Thread Michael Busch
Hi Grant, I certainly agree that it would be great if we could make some progress and commit the payloads patch soon. I think it is quite independent from FI. FI will introduce different posting formats (see Wiki: http://wiki.apache.org/lucene-java/FlexibleIndexing). Payloads will be part of

Re: [jira] Updated: (LUCENE-755) Payloads

2007-03-11 Thread Michael Busch
Grant Ingersoll wrote: Cool. I will try and take a look at it tomorrow. Since we have the lazy SegTermPos thing in now, we should be able to integrate this into scoring via the Similarity and merge TermDocs and TermPositions like you suggested. If I can get the Scoring piece in and people a

Re: Flexible indexing

2007-03-11 Thread Michael Busch
Grant Ingersoll wrote: In regard of FI and 662 however I really believe we should split it up and plan ahead (in a way I mentioned already), so that we have more isolated patches. It is really great that we have 662 already (Nicolas, thank you so much for your hard work, I hope you'll keep w

Re: [jira] Updated: (LUCENE-755) Payloads

2007-03-12 Thread Michael Busch
Grant Ingersoll wrote: I haven't looked at your latest patch yet, so this is just guesswork, but was thinking in TermScorer, around line 75 or so, we could add: score *= similarity.scorePayload(payloadBuffer); TermScorer currently doesn't iterate over the positions. It uses a buffer to load 3

Re: Flexible indexing

2007-03-12 Thread Michael Busch
Marvin Humphrey wrote: On Mar 10, 2007, at 3:27 PM, Michael Busch wrote: I'm going to respond to this over several mails (: and possibly days :) because there's an awful lot here, and I've already implemented a lot of it in KS. We should also make this public, so that users

Re: Flexible indexing

2007-03-12 Thread Michael Busch
Marvin Humphrey wrote: On Mar 12, 2007, at 2:11 PM, Michael Busch wrote: I think our best option here is to have a closed XML file for the index format/configuration (something like you sent in your other mail) plus a binary file for custom index-level metadata like Grant suggested. Why

Re: Flexible indexing

2007-03-13 Thread Michael Busch
Marvin Humphrey wrote: It uses global field semantics, which Hoss won't be happy about. ;) However, I'm grateful to Hoss for past critiques, as they've helped me to refine and improve how Schema works. For instance, as of KS 0.20_02 you can introduce new field_name => FieldSpec association

Re: svn commit: r518529 - /lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java

2007-03-15 Thread Michael Busch
[EMAIL PROTECTED] wrote: Author: doronc Date: Thu Mar 15 02:08:07 2007 New Revision: 518529 URL: http://svn.apache.org/viewvc?view=rev&rev=518529 Log: maintain most recent file format in a single line in the code. (this is less bug prone.) Cool, Doron. I actually had a bug in my local, exp

Re: Build failed in Hudson: Lucene-trunk #692

2009-01-01 Thread Michael Busch
On 1/1/09 6:28 AM, Michael McCandless wrote: I think the pom.xml.template under contrib/spatial is broken (looks like a copy from contrib/instantiated), which is then causing dist-maven task to fail with this error: Error deploying artifact: File /export/home/hudson/hudson-slave/workspace

Re: [jira] Updated: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-20 Thread Michael Busch
Great, thanks Mark! -Michael On 1/19/09 3:22 PM, Mark Miller (JIRA) wrote: [ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1483: Description: This i

2.4.1 release?

2009-02-18 Thread Michael Busch
Hi all, 2.4.0 is out since October and since then two quite serious bugs were fixed: - LUCENE-1452: Binary fields content lost after segment merging - LUCENE-1474: Incorrect SegmentInfo.delCount when IndexReader.flush() is used I think we should make a 2.4.1 release? -Michael

Re: 2.4.1 release?

2009-02-18 Thread Michael Busch
+1. I'll try to work on my 2.9 patches this weekend. Mike On Feb 18, 2009, at 5:07 AM, Michael Busch wrote: Hi all, 2.4.0 is out since October and since then two quite serious bugs were fixed: - LUCENE-1452: Binary fields content lost after segment merging - LUCENE-1474: Incorrect

segments.gen file

2009-02-23 Thread Michael Busch
Hi, with LUCENE-1044 we sync the FS after writing files. Do we still need the segments.gen file to handle stale caches correctly? -Michael - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional comman

Re: segments.gen file

2009-02-26 Thread Michael Busch
o the segments file. -Michael Mike Michael Busch wrote: Hi, with LUCENE-1044 we sync the FS after writing files. Do we still need the segments.gen file to handle stale caches correctly? -Michael

Re: segments.gen file

2009-02-27 Thread Michael Busch
On 2/26/09 1:50 PM, Michael McCandless wrote: Michael Busch wrote: On 2/24/09 4:05 AM, Michael McCandless wrote: I believe we still need this, for remote filesystems (like NFS) that have inconsistent client-side caching. The fsync() ensures the local IO system has moved the bytes/file

Re: [VOTE] Release Lucene 2.4.1

2009-03-03 Thread Michael Busch
+1 I tested successfully on mac os: --> lucene-2.4.1-src.tar.gz: - ant dist dist-src - ant test --> lucene-2.4.1.tar.gz - ant clean - ant compile-demo - ant demo-index-html - ant demo-index-text - ant demo-search-html - ant demo-search-text - ant jar-demo - ant war-demo -Michael On 3/2/09

Re: [VOTE] Release Lucene 2.4.1

2009-03-04 Thread Michael Busch
On 3/4/09 5:28 AM, Michael McCandless wrote: Grant Ingersoll wrote: On Mar 4, 2009, at 8:05 AM, Michael McCandless wrote: lucene-2.4.1-src.tar.gz --> ant test I'm not sure how this could ever pass. The lib directory is not present, so neither is JUnit, so the tests do not compile. I'm

Re: [VOTE] Release 2.4.1, take 2

2009-03-06 Thread Michael Busch
+1 -Michael On 3/4/09 2:27 PM, Michael McCandless wrote: This is a new vote! I've re-built the release artifacts (to include LUCENE-1552 fix), derived from revision 750176 on the 2.4 branch. Here are the changes: http://people.apache.org/~mikemccand/staging-area/lucene2.4.1rc2/changes/Cha

New flexible query parser

2009-03-16 Thread Michael Busch
Hello, in my team at IBM we have used a different query parser than Lucene's in our products for quite a while. Recently we spent a significant amount of time in refactoring the code and designing a very generic architecture, so that this query parser can be easily used for different products wit

Re: New flexible query parser

2009-03-16 Thread Michael Busch
I personally think this is pretty solid code with good unit tests and documentation. So I'd also be fine with adding it to the core. - Mark Michael Busch wrote: Hello, in my team at IBM we have used a different query parser than Lucene's in our products for quite a while. Recently we s

Re: New flexible query parser

2009-03-17 Thread Michael Busch
Thanks Grant, I'll go through the template tonight. Luis and Adriano are preparing a patch - they should be ready in a day or two. So we can simply open a Jira issue and attach as a normal patch? -Michael On 3/17/09 12:06 PM, Grant Ingersoll wrote: On Mar 16, 2009, at 7:23 PM, Mi

Re: New flexible query parser

2009-03-17 Thread Michael Busch
OK, sounds good. Thanks, Grant. -Michael On Tue, Mar 17, 2009 at 10:02 AM, Grant Ingersoll wrote: > > On Mar 17, 2009, at 11:18 AM, Michael Busch wrote: > > Thanks Grant, I'll go through the template tonight. >> >> Luis and Adriano are preparing a patch - they shoul

Re: New flexible query parser

2009-03-20 Thread Michael Busch
On 3/20/09 10:58 PM, Chris Hostetter wrote: : My vote for contrib would depend on the state of the code - if it passes all : the tests and is truly back compat, and is not crazy slower, I don't see why : we don't move it in right away depending on confidence levels. That would : ensure use and at

Modularization (was: Re: New flexible query parser)

2009-03-21 Thread Michael Busch
On 3/21/09 12:27 AM, Michael Busch wrote: +1. I'd love to see Lucene going into such a direction. However, I'm a little worried about contrib's reputation. I think it contains components with differing levels of activity, maturity and support. So maybe instead of moving things

Re: Modularization

2009-03-21 Thread Michael Busch
ut where to find them (which jar). Then by clicking on e.g. queries, the user would see the list of all queries we support. But I think we should still have "main modules", such as core, queries, analyzers, ... and separately e.g. "sandbox modules?", for the things currently in

Re: Modularization

2009-03-21 Thread Michael Busch
On 3/21/09 1:36 PM, Michael McCandless wrote: And I don't think the sudden separation of "core" vs "contrib" should be so prominent (or even visible); it's really a detail of how we manage source control. When looking at the website I'd like read that Lucene can do hit highlighting, powerful que

Improve worst-case performance of TrieRange queries

2009-03-23 Thread Michael Busch
Let me give an example to explain my idea - I'm using dates in my example, because it's easier to imagine :) Let's say we have the following posting lists. There are 20 docs in the index and an X means that a doc contains the corresponding term: JanX X Feb XX X Mar

Re: Improve worst-case performance of TrieRange queries

2009-03-24 Thread Michael Busch
Uwe and I talked a little bit about this at the ApacheCon. We figured that this will probably only improve a very small amount of ranges, so as Uwe recommended, this is probably not worth the effort and complexity. Never mind, just an idea :) -Michael On 3/24/09 12:40 AM, Michael Busch wrote

NIO.2

2009-03-28 Thread Michael Busch
NIO.2 sounds great. Though, it will probably take a pretty long time before we can switch Lucene to Java 1.7 :( We could write a (contrib) module that we don't ship together with the core that has a Directory implementation which uses NIO.2. http://jcp.org/en/jsr/detail?id=203 http://ronsoft

Re: Possible IndexInput optimization

2009-03-29 Thread Michael Busch
On 3/29/09 12:43 AM, Earwin Burrfoot wrote: There are three cases when we can override readNNN methods and provide implementations with zero or minimum method invocations - RAMDirectory, MMapDirectory and BufferedIndexInput for FSDirectory/CompoundFileReader. Anybody tried this? A while ag

Re: Modularization

2009-03-30 Thread Michael Busch
On 3/31/09 1:31 AM, Chris Hostetter wrote: code isolation (by directory hierarchy) is the best way I've seen to ensure modularization, and protect against inadvertent dependency bleeding. +1. That's actually what I meant with "one-to-one mapping between the packaging and the source code" (I didn

Re: Future projects

2009-04-03 Thread Michael Busch
On 4/3/09 3:35 AM, Michael McCandless wrote: It seems like we've been talking about CSF for 2 years and there isn't a patch for it? If I had more time I'd take a look. What is the status of it? I think Michael is looking into it? I'd really like to get it into 2.9. We should do it in co

Re: possible TermInfosReader speedup

2009-04-08 Thread Michael Busch
On 4/8/09 2:08 PM, Earwin Burrfoot wrote: On Thu, Apr 9, 2009 at 00:14, Michael McCandless wrote: On Wed, Apr 8, 2009 at 3:46 PM, Earwin Burrfoot wrote: Currently, when we're seeking a given Term, it does a binary search across all term space, including terms belonging to other fi

Re: Future projects

2009-04-12 Thread Michael Busch
On 4/4/09 4:42 AM, Michael McCandless wrote: As I recently mentioned on 1231 I'm looking into changing the Document and Field APIs. I've some rough prototype. I think we should also try to get it in before 2.9? On the other hand I don't want to block the 2.9 release with too much stuff. T

Re: [Lucene-java Wiki] Update of "LuceneAtApacheConUs2009" by MichaelBusch

2009-04-27 Thread Michael Busch
ait to fill this in until Concom provides us a list from the regular CFP process. = Possible Talks or Tutorials = - * Lucene Basics (Michael Busch) + * Lucene Basics (Michael Busch or others?) * Intro to Solr (: Hoss out of the box talk?) * Intro to Nutch and/or Nutch Vertical Search

Re: new TokenStream api Question

2009-04-28 Thread Michael Busch
Hi Eks Dev, I actually started experimenting with changing the new API slightly to overcome one drawback: with the variables now distributed over various Attribute classes (vs. being in a single class Token previously), cloning a "Token" (i.e. calling captureState()) is more expensive. This s

Re: Lucene's default settings & back compatibility

2009-05-18 Thread Michael Busch
+1. This would be great! Michael On May 18, 2009, at 2:06 PM, Michael McCandless wrote: As we all know, Lucene's back-compat policy necessarily hurts the out-of-the-box experience for new users: because we are only allowed to make substantial improvements to Lucene's default settings at a maj

Re: New Token API was Re: Payloads and TrieRangeQuery

2009-06-14 Thread Michael Busch
On 6/14/09 5:17 AM, Grant Ingersoll wrote: Agreed. I've been bringing it up for a while now and made the same comments when it was first introduced, but felt like the lone voice in the wilderness on it and gave way [1], [2], [3]. Now that others are writing/converting, I think it is worth rev

Re: Build failed in Hudson: Lucene-trunk #859

2009-06-15 Thread Michael Busch
Thanks, Simon! I just committed the fix. Michael On 6/15/09 12:20 AM, Simon Willnauer wrote: There is the wrong name in the pom.xml.template for contrib/remote Here is a diff with a patch: Index: contrib/remote/pom.xml.template

Re: New Token API was Re: Payloads and TrieRangeQuery

2009-06-15 Thread Michael Busch
This is excellent feedback, Robert! I agree this is confusing; especially having a deprecated API and only an experimental one that replaces the old one. We need to change that. And I don't like the *useNewAPI*() methods either. I spent a lot of time thinking about backwards compatibility for th

Re: New Token API was Re: Payloads and TrieRangeQuery

2009-06-15 Thread Michael Busch
I have implemented most of that actually (the interface part and Token implementing all of them). The problem is a paradigm change with the new API: the assumption is that there is always only one single instance of an Attribute. With the old API, it is recommended to reuse the passed-in token

Re: New Token API was Re: Payloads and TrieRangeQuery

2009-06-15 Thread Michael Busch
me chain must use the same API, it's discouraging if all the contrib stuff doesn't support the new API, it makes me want to just stick with the old so everything will work. So I think contribs being on the new API is really important otherwise no one will want to use it. On Mon, Jun 15, 2009 at 4:2

Re: New Token API was Re: Payloads and TrieRangeQuery

2009-06-15 Thread Michael Busch
rare TokenStreams that did not reuse token before (and were slow before, too). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de *From:* Michael Busch [mailto:busch...@gmail.com]

Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the incredibly long recent thread about backwards-compatibility, so maybe what I'm about to propose has been proposed already. In that case my apologies in advance. Rather than discussing our current backward

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
around changing default settings. But maybe we should take it one step at a time. Shai On Tue, Jun 16, 2009 at 1:37 PM, Michael Busch wrote: Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read th

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
we should make _reasonable_ efforts to maintain back compatibility. -Grant On Jun 16, 2009, at 6:37 AM, Michael Busch wrote: Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the incredibly long recent thread about backwards-compatibili

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
e() in version 3.3. This will break any implementations of Fieldable. Still, I'm fine with #4 as stated. Finally, I still think we all agree that when possible, we should make _reasonable_ efforts to maintain back compatibility. -Grant On Jun 16, 2009, at 6:37 AM, Michael Busch wrote:

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
ng, or are we going to warn one release before the change, or do we provide old-behaviour switches that are deprecated since their birth, or we keep said switches for a couple of major releases? On Tue, Jun 16, 2009 at 14:37, Michael Busch wrote: Probably everyone is thinking right now "O

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
From a backwards-compatibility point of view, nothing really. Michael On 6/16/09 8:59 AM, Yonik Seeley wrote: So under this proposal, what's the difference between a major and minor release? -Yonik http://www.lucidimagination.com On Tue, Jun 16, 2009 at 6:37 AM, Michael Busch

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Except regarding file format compatibility, see 1. On 6/16/09 9:04 AM, Michael Busch wrote: From a backwards-compatibility point of view, nothing really. Michael On 6/16/09 8:59 AM, Yonik Seeley wrote: So under this proposal, what's the difference between a major and minor release?

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
ate - we should mention how we have lots of exceptions and experimental API's that we use to get around what is there now. I imagine that will continue to an extent. Our policy should allow our current behavior :) - Mark Michael Busch wrote: Fair enough. We certainly want our users to unders

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
On 6/16/09 9:41 AM, DM Smith wrote: Perhaps you should go back and see why the thread died. OK I will read it. I think we should do the following: I'll send the mentioned mail to the user list and wait for feedback. After a decent amount of time for feedback I will call a vote on java-dev w

Re: Lucene 2.9 Again

2009-06-16 Thread Michael Busch
Cool, seems like Mark is volunteering to be the 2.9 release manager ;) I need to get the TokenStream API changes in and ideally LUCENE-1448. How soon is soon? Code freeze in 2-3 weeks or so maybe? Then 7-10 days testing, so 2.9 should be out mid July? Sounds reasonable? Michael On 6/16/09 2

Re: New Token API was Re: Payloads and TrieRangeQuery

2009-06-17 Thread Michael Busch
On 6/15/09 10:10 AM, Grant Ingersoll wrote: But, as Michael M reminded me, it is complex, so please accept my apologies. No worries, Grant! I was not really offended, but rather confused... Thanks for clarifying. Michael
