[jira] Assigned: (LUCENE-1734) CharReader should delegate reset/mark/markSupported

2009-07-06 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-1734: - Assignee: Uwe Schindler I think this patch looks good. I will commit shortly. > CharRea

[jira] Closed: (LUCENE-1734) CharReader should delegate reset/mark/markSupported

2009-07-06 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler closed LUCENE-1734. - Resolution: Fixed Committed revision 791415. Thanks Koji! > CharReader should delegate reset/
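
For context on what LUCENE-1734 asks for, here is a minimal sketch of a Reader wrapper that forwards mark/reset/markSupported to the reader it wraps. The class names DelegatingReader and DelegatingReaderDemo are hypothetical and this is not the committed patch; it only illustrates the delegation pattern using plain java.io.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Small demo: mark, read one char, reset, read the same char again.
class DelegatingReaderDemo {
    public static void main(String[] args) throws IOException {
        Reader r = new DelegatingReader(new StringReader("abc"));
        r.mark(8);
        int first = r.read();
        r.reset();
        int again = r.read();
        System.out.println(r.markSupported() + " " + (char) first + " " + (char) again);
    }
}

// Hypothetical wrapper, not Lucene's CharReader: it only illustrates
// forwarding mark/reset/markSupported to the wrapped Reader.
final class DelegatingReader extends Reader {
    private final Reader in;

    DelegatingReader(Reader in) {
        this.in = in;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        return in.read(cbuf, off, len);
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Without these three overrides, java.io.Reader's defaults apply:
    // markSupported() returns false and mark()/reset() throw IOException.
    @Override
    public boolean markSupported() {
        return in.markSupported();
    }

    @Override
    public void mark(int readAheadLimit) throws IOException {
        in.mark(readAheadLimit);
    }

    @Override
    public void reset() throws IOException {
        in.reset();
    }
}
```

If the three overrides are removed, the demo would fail at r.mark(8), because the base Reader class does not support marking even when the wrapped reader does.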

Re: addIndexesNoOptimize

2009-07-06 Thread Michael McCandless
On Mon, Jul 6, 2009 at 2:18 AM, John Wang wrote: > Currently, addIndexesNoOptimize(Directory[] dir) is really really > really fast! (I duplicated my index of 15k docs 200 times and created a 3M > doc index in less than a minute) Perhaps we should handle duplicate > directory names more gracef

[jira] Commented: (LUCENE-1727) Order of stored Fields not maintained

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727473#action_12727473 ] Michael McCandless commented on LUCENE-1727: bq. If we start guaranteeing that

[jira] Assigned: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1726: -- Assignee: Michael McCandless > IndexWriter.readerPool create new segmentReader

Re: Bug in DocInvertedPerField?

2009-07-06 Thread Michael McCandless
Were the two fields that you added to the doc the same field name? In which case, the pos incr gap is in fact needed, even if the fields are pre-analyzed (have TokenStream values)? Mike On Thu, Jul 2, 2009 at 10:25 AM, Shai Erera wrote: > I hit NPE in DocInvertedPerField in the following scenari

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Michael McCandless
Is this the native vs LF svn:eol-style that Uwe already fixed? Mike On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera wrote: > Can somebody try to revert the change and test it on Windows? > > On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir wrote: >> >> well then I have no idea why it doesn't fail. Except

[jira] Commented: (LUCENE-1566) Large Lucene index can hit false OOM due to Sun JRE issue

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727484#action_12727484 ] Michael McCandless commented on LUCENE-1566: Yes, I'll take this. Thanks Simo

[jira] Updated: (LUCENE-1566) Large Lucene index can hit false OOM due to Sun JRE issue

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1566: --- Fix Version/s: 2.9 > Large Lucene index can hit false OOM due to Sun JRE issue > ---

[jira] Assigned: (LUCENE-1566) Large Lucene index can hit false OOM due to Sun JRE issue

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1566: -- Assignee: Michael McCandless (was: Simon Willnauer) > Large Lucene index can

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Robert Muir
Yeah, it's fixed now. On Mon, Jul 6, 2009 at 7:06 AM, Michael McCandless wrote: > Is this the native vs LF svn:eol-style that Uwe already fixed? > > Mike > > On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera wrote: >> Can somebody try to revert the change and test it on Windows? >> >> On Thu, Jul 2, 2009

Re: Bug in DocInvertedPerField?

2009-07-06 Thread Shai Erera
Yes they have the same field name. Can we use the default posIncr? If I want to create an IndexWriter w/o an Analyzer, why should I be forced to do new IndexWriter(new SimpleAnalyzer() /* for example */ ...), when the analyzer will never be used? It is an edge case though which I can easily reprod

RE: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Uwe Schindler
In my opinion, these files should be converted to UTF-8 and committed again (and the Reader in the test reconfigured for UTF-8). Then they can be native EOL style again. The problem is that SVN can only handle the EOL style for one-byte-per-char and UTF-8 files. I'll give it a try here (and I have a

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Robert Muir
Uwe, I think so too. This way it will not be prone to breakage again. On Mon, Jul 6, 2009 at 8:38 AM, Uwe Schindler wrote: > In my opinion, these files should be converted to UTF-8 and committed again > (and the Reader in the test reconfigured for UTF-8). Then they can be native > EOL style again.

RE: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Uwe Schindler
The whole Russian analyzer is very strange and works against all charset/Unicode conventions. It defines its own "charsets" (the only valid one is UNICODE), which are all applied to standard Java 16-bit chars. The test shows how this works: it opens a text file in KOI8 using the "ISO-8859-1" charset (

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Robert Muir
Uwe, I completely agree. To add the icing on the cake, the entire analyzer appears to be just a duplication of the contrib/snowball Russian functionality...! On Mon, Jul 6, 2009 at 9:19 AM, Uwe Schindler wrote: > The whole Russian analyzer is very strange and works against all > charset/unicode con

[jira] Commented: (LUCENE-1730) TrecContentSource should use a fixed encoding, rather than system dependent

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727524#action_12727524 ] Mark Miller commented on LUCENE-1730: - I haven't patched the code in, but looking at th

[jira] Commented: (LUCENE-1730) TrecContentSource should use a fixed encoding, rather than system dependent

2009-07-06 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727529#action_12727529 ] Shai Erera commented on LUCENE-1730: if (encoding == null) happens in setConfig and th

RE: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Uwe Schindler
I fixed the encoding problem by converting the test files to UTF-8 and changing the Reader charset parameter to UTF-8. All files now have the old native EOL style again. Could somebody check whether on Unix the files only have LF (and on Windows the files have CRLF, which is the state in which I committed them)? The o
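
The fix described in this message comes down to decoding the test files with the charset they were actually written in. A minimal sketch of that idea, assuming nothing about the real test code: the class ExplicitCharsetDemo and its decode helper are hypothetical, and StandardCharsets is a later JDK convenience (code from this period would typically pass the charset name "UTF-8" to InputStreamReader instead).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of the Lucene test suite.
class ExplicitCharsetDemo {
    // Decode all bytes with the given charset via an InputStreamReader.
    static String decode(byte[] bytes, Charset cs) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(bytes), cs)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Five Cyrillic letters, written as escapes so the source stays ASCII.
        String word = "\u0441\u043b\u043e\u0432\u043e";
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);

        // Matching charset: the text round-trips.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(word));
        // Mismatched charset: every byte becomes its own char, so the text differs.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(word));
    }
}
```

Reading with a mismatched one-byte charset does not fail; it silently yields one char per byte, which is why such a mismatch can go unnoticed until something else (here, EOL conversion) disturbs the bytes.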

[jira] Commented: (LUCENE-1567) New flexible query parser

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727531#action_12727531 ] Mark Miller commented on LUCENE-1567: - I wonder if all of this was really necessary. M

[jira] Commented: (LUCENE-1730) TrecContentSource should use a fixed encoding, rather than system dependent

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727533#action_12727533 ] Mark Miller commented on LUCENE-1730: - Okay, cool. I'll patch it in, run the tests, an

[jira] Assigned: (LUCENE-1730) TrecContentSource should use a fixed encoding, rather than system dependent

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned LUCENE-1730: --- Assignee: Mark Miller > TrecContentSource should use a fixed encoding, rather than system de

Re: Bug in DocInvertedPerField?

2009-07-06 Thread Yonik Seeley
On Mon, Jul 6, 2009 at 7:12 AM, Shai Erera wrote: > If I want to create an IndexWriter w/o an > Analyzer, why should I be forced to do new IndexWriter(new SimpleAnalyzer() Passing an Analyzer really doesn't seem like a hardship... it's the current interface that defines analysis, and it would com

[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727567#action_12727567 ] Michael McCandless commented on LUCENE-1726: Can we make the MapValue strongly

[jira] Updated: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1726: --- Fix Version/s: (was: 2.9) 3.1 > IndexWriter.readerPool create

[jira] Commented: (LUCENE-1566) Large Lucene index can hit false OOM due to Sun JRE issue

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727579#action_12727579 ] Michael McCandless commented on LUCENE-1566: Could we move the fix down into S

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Michael McCandless
contrib/analyzers/src/test/org/apache/lucene/analysis/ru/stemsUTF8.txt looks right on OpenSolaris (unix EOLs). Mike On Mon, Jul 6, 2009 at 9:53 AM, Uwe Schindler wrote: > I fixed the encoding problem by converting the test files to UTF-8 and > changed the Reader charset parameter to UTF-8. All fil

RE: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Uwe Schindler
Wonderful, and the tests (TestRussianStems) pass? Thanks, Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Monday, July 06, 2009 5:37 PM >

[jira] Commented: (LUCENE-1566) Large Lucene index can hit false OOM due to Sun JRE issue

2009-07-06 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727595#action_12727595 ] Simon Willnauer commented on LUCENE-1566: - bq. Could we move the fix down into Sim

[jira] Resolved: (LUCENE-1730) TrecContentSource should use a fixed encoding, rather than system dependent

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1730. - Resolution: Fixed > TrecContentSource should use a fixed encoding, rather than system dependent

Re: Bug in DocInvertedPerField?

2009-07-06 Thread Shai Erera
Ok. BTW, maybe we want to ensure then that the Analyzer passed to IndexWriter is not null, since it looks to be a required argument, unless I always addDocument w/ an Analyzer. Thanks for the replies guys. Shai On Mon, Jul 6, 2009 at 5:22 PM, Yonik Seeley wrote: > On Mon, Jul 6, 2009 at 7:12 AM

small faults in new Numeric* class Javadoc

2009-07-06 Thread Koji Sekiguchi
There seem to be some trivial faults in the Javadoc. In NumericRangeQuery, "Filter" should be "Query": - * Filter f = NumericRangeQuery.newFloatRange(field, precisionStep, + * Query query = NumericRangeQuery.newFloatRange(field, precisionStep, And in NumericField, there is incorrect sample code for Nu

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727610#action_12727610 ] Michael McCandless commented on LUCENE-1721: Right, a merge can commit at any

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-07-06 Thread Michael McCandless
On Mon, Jul 6, 2009 at 11:40 AM, Uwe Schindler wrote: > Wonderful, and the tests (TestRussianStems) pass? Yup! Mike - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...

[jira] Commented: (LUCENE-1650) Small fix in CustomScoreQuery JavaDoc

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727612#action_12727612 ] Mark Miller commented on LUCENE-1650: - No I'm not :) Yonik, could you take a peek at t

RE: small faults in new Numeric* class Javadoc

2009-07-06 Thread Uwe Schindler
Thanks, I fix. It is just copy'n'paste errors! - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Koji Sekiguchi [mailto:k...@r.email.ne.jp] > Sent: Monday, July 06, 2009 6:18 PM > To: java-dev@lucene.apach

[jira] Updated: (LUCENE-1650) Small fix in CustomScoreQuery JavaDoc

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1650: Attachment: LUCENE-1650.patch updated to trunk in any case. > Small fix in CustomScoreQuery JavaD

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727614#action_12727614 ] Tim Smith commented on LUCENE-1721: --- One thing that would be nice to see is a boolean re

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727620#action_12727620 ] Yonik Seeley commented on LUCENE-1721: -- bq. obviously, this is rather impractical as

[jira] Updated: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1486: Attachment: LUCENE-1486.patch Whoops - almost let some 1.5 slip by: throw new IllegalArgumentExc

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727623#action_12727623 ] Tim Smith commented on LUCENE-1721: --- bq. Sounds like you could perhaps use reopen() or t

A Comparison of Open Source Search Engines

2009-07-06 Thread Sean Owen
http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/ I imagine many of you already saw this -- Lucene does pretty well in this "shootout". The only area it tended to lag, it seems, is memory usage and speed in some cases. -

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727639#action_12727639 ] Jason Rutherglen commented on LUCENE-1721: -- I'm still of the somewhat naive opini

Re: addIndexesNoOptimize

2009-07-06 Thread Jason Rutherglen
> MergePolicy expects to receive SegmentInfo instances I ran into this implementing LUCENE-1589. On Mon, Jul 6, 2009 at 3:18 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Mon, Jul 6, 2009 at 2:18 AM, John Wang wrote: > > > Currently, addIndexesNoOptimize(Directory[] dir) is

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727641#action_12727641 ] Yonik Seeley commented on LUCENE-1721: -- bq. but some custom caches may not work on a

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727646#action_12727646 ] Tim Smith commented on LUCENE-1721: --- Absolutely nothing would have to have actually chan

[jira] Reopened: (LUCENE-1591) Enable bzip compression in benchmark

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reopened LUCENE-1591: - Assignee: Mark Miller Lucene Fields: [New, Patch Available] (was: [New]) some java 1.5

[jira] Updated: (LUCENE-1591) Enable bzip compression in benchmark

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1591: Attachment: LUCENE-1591.patch Looks like this spread a little in the docmaker/contentsource breaku

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727668#action_12727668 ] Michael McCandless commented on LUCENE-1591: Thanks Mark! > Enable bzip compre

[jira] Commented: (LUCENE-1650) Small fix in CustomScoreQuery JavaDoc

2009-07-06 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727675#action_12727675 ] Yonik Seeley commented on LUCENE-1650: -- Not sure why you wanted me to take a peek - t

[jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-06 Thread Mark Harwood (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727685#action_12727685 ] Mark Harwood commented on LUCENE-1486: -- Hi Mark, Mind if I try committing this patch?

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727683#action_12727683 ] Yonik Seeley commented on LUCENE-1721: -- bq. Absolutely nothing would have to have act

[jira] Commented: (LUCENE-1650) Small fix in CustomScoreQuery JavaDoc

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727688#action_12727688 ] Mark Miller commented on LUCENE-1650: - bq. Not sure why you wanted me to take a peek -

[jira] Updated: (LUCENE-1650) Small fix in CustomScoreQuery JavaDoc

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1650: Affects Version/s: (was: 3.0) (was: 2.9) Fix Version/s:

[jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727692#action_12727692 ] Mark Miller commented on LUCENE-1486: - Please, by all means ! :) > Wildcards, ORs etc

[jira] Assigned: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned LUCENE-1486: --- Assignee: Mark Harwood (was: Mark Miller) > Wildcards, ORs etc inside Phrase queries >

[jira] Commented: (LUCENE-1721) IndexWriter to allow deletion by doc ids

2009-07-06 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727695#action_12727695 ] Tim Smith commented on LUCENE-1721: --- That looks like it's pretty close, and is definitely

[jira] Updated: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block

2009-07-06 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-1726: - Attachment: LUCENE-1726.patch * New SRMapValue is strongly typed * All tests pass {quo

[jira] Updated: (LUCENE-1609) Eliminate synchronization contention on initial index reading in TermInfosReader ensureIndexIsRead

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1609: --- Attachment: LUCENE-1609.patch Attached patch. This addresses this issue and LUCENE-

[jira] Resolved: (LUCENE-1591) Enable bzip compression in benchmark

2009-07-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1591. - Resolution: Fixed committed > Enable bzip compression in benchmark > --

Re: Execute a testcase method via ant?

2009-07-06 Thread Michael McCandless
I would love to have the -Dtestmethod=XXX! Mike On Tue, Jun 23, 2009 at 7:42 PM, Jason Rutherglen wrote: > More like ant test -Dtestcase=TestSort -Dtestmethod=testMultiSort > > or > > ant test -Dtestcase=TestSort.testMultiSort > > I Googled a lot for "ant junit test method" and variants.  Couldn'

[jira] Commented: (LUCENE-1704) org.apache.lucene.ant.HtmlDocument added Tidy config file passthrough availability

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727723#action_12727723 ] Michael McCandless commented on LUCENE-1704: There is a preview button (that s

[jira] Closed: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-06 Thread Mark Harwood (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood closed LUCENE-1486. Resolution: Fixed Committed in 791579 - http://svn.apache.org/viewvc?rev=791579&view=rev > Wildc

[jira] Updated: (LUCENE-1704) org.apache.lucene.ant.HtmlDocument added Tidy config file passthrough availability

2009-07-06 Thread Keith Sprochi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Sprochi updated LUCENE-1704: -- Description: Parsing HTML documents using the org.apache.lucene.ant.HtmlDocument.Document met

[jira] Commented: (LUCENE-1522) another highlighter

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727728#action_12727728 ] Michael McCandless commented on LUCENE-1522: Is it possible to decouple this i

[jira] Resolved: (LUCENE-1704) org.apache.lucene.ant.HtmlDocument added Tidy config file passthrough availability

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1704. Resolution: Fixed Thanks Keith! > org.apache.lucene.ant.HtmlDocument added Tidy c

[jira] Commented: (LUCENE-1704) org.apache.lucene.ant.HtmlDocument added Tidy config file passthrough availability

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727732#action_12727732 ] Michael McCandless commented on LUCENE-1704: OK the patch looks good -- I'll c

[jira] Commented: (LUCENE-1566) Large Lucene index can hit false OOM due to Sun JRE issue

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727740#action_12727740 ] Michael McCandless commented on LUCENE-1566: bq. I did hit the error while I d

Re: Execute a testcase method via ant?

2009-07-06 Thread Jason Rutherglen
I'll make an issue for testing by method, it should be easier to implement than multithreading JUnit (which seems to require core ANT/JUnit work). On Mon, Jul 6, 2009 at 12:26 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > I would love to have the -Dtestmethod=XXX! > > Mike > > On T

Re: A Comparison of Open Source Search Engines

2009-07-06 Thread John Wang
Vik did a very nice job. One thing the experiment did not mention is that Lucene handles incremental updates, whereas many of the other "competitors" do not. So the indexing performance comparison is not really fair. -John On Mon, Jul 6, 2009 at 8:06 AM, Sean Owen wrote: > > http://zooie.wordpre

Re: A Comparison of Open Source Search Engines

2009-07-06 Thread Earwin Burrfoot
I'd say out of these libraries only Lucene and Sphinx are worth mentioning. There's also MG4J, which wasn't covered and has a nice algorithmic background. Does anybody know other interesting open-source search engines? On Tue, Jul 7, 2009 at 00:39, John Wang wrote: > Vik did a very nice job. > One th

[jira] Created: (LUCENE-1735) IndexReader.reopen() does not retain TermInfosIndexDivisor setting for newly opened segments

2009-07-06 Thread Tim Smith (JIRA)
IndexReader.reopen() does not retain TermInfosIndexDivisor setting for newly opened segments Key: LUCENE-1735 URL: https://issues.apache.org/jira/browse/LUCENE-1735

Re: A Comparison of Open Source Search Engines

2009-07-06 Thread eks dev
> Does anybody know other interesting open-source search engines? Minion (https://minion.dev.java.net/) - Original Message > From: Earwin Burrfoot > To: java-dev@lucene.apache.org > Sent: Monday, 6 July, 2009 23:01:52 > Subject: Re: A Comparison of Open Source Search Engines > > I'd sa

[jira] Resolved: (LUCENE-1735) IndexReader.reopen() does not retain TermInfosIndexDivisor setting for newly opened segments

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1735. Resolution: Duplicate This is a dup of LUCENE-1718. > IndexReader.reopen() does n

[jira] Commented: (LUCENE-1718) IndexReader.setTermInfosIndexDivisor doesn't carry over to reopened readers

2009-07-06 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727791#action_12727791 ] Tim Smith commented on LUCENE-1718: --- perfect i had checked your last patch on LUCENE-16

[jira] Commented: (LUCENE-1718) IndexReader.setTermInfosIndexDivisor doesn't carry over to reopened readers

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727792#action_12727792 ] Michael McCandless commented on LUCENE-1718: Thanks Tim. This should be fixed

[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727813#action_12727813 ] Michael McCandless commented on LUCENE-1726: The hazard is something like this

[jira] Updated: (LUCENE-1727) Order of stored Fields not maintained

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1727: --- Attachment: LUCENE-1727.patch Attached patch. I moved StoredFieldsWriter up in the

[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block

2009-07-06 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727823#action_12727823 ] Jason Rutherglen commented on LUCENE-1726: -- Shouldn't we be seeing an exception i

[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727843#action_12727843 ] Michael McCandless commented on LUCENE-1726: Yes, we should eventually see a f

[jira] Assigned: (LUCENE-1717) IndexWriter does not properly account for the RAM consumed by pending deletes

2009-07-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1717: -- Assignee: Michael McCandless > IndexWriter does not properly account for the R

[jira] Updated: (LUCENE-1522) another highlighter

2009-07-06 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated LUCENE-1522: --- Attachment: LUCENE-1522.patch Thank you for your advice, Michael. bq. because they test mul

[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block

2009-07-06 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727882#action_12727882 ] Jason Rutherglen commented on LUCENE-1726: -- When I moved the sync block around in

Re: A Comparison of Open Source Search Engines

2009-07-06 Thread John Wang
mg4j is a nice project. It is missing the incremental aspects as well. The "older" paper this experiment mentioned contains lucene-mg4j comparisons. -John On Mon, Jul 6, 2009 at 2:01 PM, Earwin Burrfoot wrote: > I'd say out of these libraries only Lucene and Sphinx are worth mentioning. > > Ther