[jira] Updated: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2455: --- Attachment: LUCENE-2455_3x.patch Patch applies Mike's comments. I think this is ready to go in. I'd

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871023#action_12871023 ] Shai Erera commented on LUCENE-2455: bq. CFW's comment should be "make it 1 lower" Ri

[jira] Updated: (SOLR-1870) Binary Update Request (javabin) fails when the field type of a multivalued SolrInputDocument field is a Set (or any type that is identified as an instance of iterable)

2010-05-24 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1870: - Attachment: SOLR-1870.patch fixing JavabinCodec to write collection as array > Binary Update Request (jav

[jira] Commented: (SOLR-1870) Binary Update Request (javabin) fails when the field type of a multivalued SolrInputDocument field is a Set (or any type that is identified as an instance of iterable)

2010-05-24 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871001#action_12871001 ] Noble Paul commented on SOLR-1870: -- bq. "top level" there will be an Iterator of docs, so i

DocBuilder inefficiency?

2010-05-24 Thread Robert Zotter
I am looking into the collectDelta method in DocBuilder.java, and I noticed that to determine the deltaRemoveSet it currently loops through the whole deltaSet for each deleted row (version 1.4.0, line 641). Does anyone else agree that this is quite inefficient? For delta-imports with a l
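To illustrate the concern raised in this message, here is a minimal sketch of the usual alternative to a nested scan: collect the deleted primary keys into a hash set once, then make a single pass over the delta rows. This is not the actual DocBuilder code; the class name `DeltaRemoveSketch`, the method `collectRemoved`, and the row representation (`Map<String, Object>` keyed by a primary-key column `pk`) are assumptions made only for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the Solr DataImportHandler implementation.
public class DeltaRemoveSketch {

    /**
     * Returns the rows of deltaSet whose primary key also appears in deletedSet.
     * Cost is roughly O(|deletedSet| + |deltaSet|) instead of the
     * O(|deletedSet| * |deltaSet|) of scanning deltaSet once per deleted row.
     */
    public static List<Map<String, Object>> collectRemoved(
            List<Map<String, Object>> deltaSet,
            List<Map<String, Object>> deletedSet,
            String pk) {
        // Index the deleted primary keys once.
        Set<Object> deletedKeys = new HashSet<>();
        for (Map<String, Object> row : deletedSet) {
            deletedKeys.add(row.get(pk));
        }
        // Single pass over the delta rows with constant-time lookups.
        List<Map<String, Object>> removed = new ArrayList<>();
        for (Map<String, Object> row : deltaSet) {
            if (deletedKeys.contains(row.get(pk))) {
                removed.add(row);
            }
        }
        return removed;
    }
}
```

The sketch assumes primary-key values implement `equals` and `hashCode` consistently (as `String` and the boxed numeric types do); whether that holds for the values DocBuilder actually handles is not something the message states.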

[jira] Updated: (SOLR-1923) add caverphone to phoneticfilter

2010-05-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-1923: -- Attachment: SOLR-1923.patch > add caverphone to phoneticfilter > > >

[jira] Created: (SOLR-1923) add caverphone to phoneticfilter

2010-05-24 Thread Robert Muir (JIRA)
add caverphone to phoneticfilter Key: SOLR-1923 URL: https://issues.apache.org/jira/browse/SOLR-1923 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.1

NPE Within IndexWriter.optimize (Solr Trunk Nightly)

2010-05-24 Thread Chris Herron
Hi, I'm using the latest nightly build of solr (apache-solr-2010-05-24_08-05-13) and am repeatedly experiencing a NullPointerException after calling delete, commit, optimize. Stack trace below. The index is ~20Gb. I'm not doing Lucene/Solr core development - I just figured this was a better pl

[jira] Updated: (LUCENE-2380) Add FieldCache.getTermBytes, to load term data as byte[]

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2380: --- Attachment: LUCENE-2380.patch New iteration attached. I got Solr mostly cutover, at

RE: Solr updateRequestHandler and performance vs. atomicity

2010-05-24 Thread karl.wright
The reason for this is simple. LCF keeps track of which documents it has handed off to Solr, and has a fairly involved mechanism for making sure that every document LCF *thinks* got there, actually does. It even uses a mechanism akin to a 2-phase commit to make sure that its internal records a

RE: .Net, Lucene and IKVM

2010-05-24 Thread Digy
This is an unresolved old topic. http://www.mail-archive.com/lucene-net-u...@incubator.apache.org/msg00872.html DIGY -Original Message- From: Andrzej Bialecki [mailto:a...@getopt.org] Sent: Tuesday, May 25, 2010 12:32 AM To: dev@lucene.apache.org Subject: .Net, Lucene and IKVM Hi all,

.Net, Lucene and IKVM

2010-05-24 Thread Andrzej Bialecki
Hi all, I'm glad to report that I was able to compile Lucene branch_3x with a recent snapshot of IKVM, and after trying out the Lucene demo apps both the IndexFiles and SearchFiles applications appear to run flawlessly. Environment is WinXP/SP2, .Net CLR 2.0, 3.0, 3.5, and IKVM downloaded from htt

[jira] Commented: (LUCENE-2413) Consolidate all (Solr's & Lucene's) analyzers into modules/analysis

2010-05-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870846#action_12870846 ] Robert Muir commented on LUCENE-2413: - By the way, one idea could be to make benchmark

Re: Welcome Andrzej Bialecki as Lucene/Solr committer

2010-05-24 Thread Yonik Seeley
On Mon, May 24, 2010 at 5:33 AM, Michael McCandless wrote: > I'm happy to announce that the PMC has accepted Andrzej Bialecki as > Lucene/Solr committer! > > Welcome aboard Andrzej, An enthusiastic jet lagged +1 ;-) -Yonik http://www.lucidimagination.com

[jira] Commented: (LUCENE-2413) Consolidate all (Solr's & Lucene's) analyzers into modules/analysis

2010-05-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870843#action_12870843 ] Robert Muir commented on LUCENE-2413: - {quote} contrib/benchmark's NewShingleAnalyzerT

[jira] Commented: (LUCENE-2413) Consolidate all (Solr's & Lucene's) analyzers into modules/analysis

2010-05-24 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870842#action_12870842 ] Doron Cohen commented on LUCENE-2413: - contrib/benchmark's NewShingleAnalyzerTask depe

Re: Solr updateRequestHandler and performance vs. atomicity

2010-05-24 Thread Mark Miller
Indexing a doc won't be as fast as raw disk IO. But you won't be doing just raw disk IO to guarantee acceptance. And that will have a cost and complexity that really makes me wonder if it's worth the speed advantage. For very large documents with complex analyzers...perhaps. But it's not going to

Re: Solr updateRequestHandler and performance vs. atomicity

2010-05-24 Thread Simon Willnauer
Hi Karl, what you are describing seems to be a good use case for something like a message queue, where you push a document or record to a queue which guarantees the queue's persistence. I look at this from a slightly different perspective: in a distributed environment you would have to guarantee delive

RE: Solr updateRequestHandler and performance vs. atomicity

2010-05-24 Thread karl.wright
Hi Mark, Unfortunately, indexing performance *is* of concern, otherwise I'd already be committing on every post. If your guess is correct, you are basically saying that adding a document to an index in Solr/Lucene is just as fast as writing that file directly to the disk. Because, obviously,

Re: Solr updateRequestHandler and performance vs. atomicity

2010-05-24 Thread Mark Miller
On 5/24/10 3:10 PM, karl.wri...@nokia.com wrote: Hi all, It seems to me that the “commit” logic in the Solr updateRequestHandler (or wherever the logic is actually located) conflates two different semantics. One semantic is what you need to do to make the index process perform well. The other sem

[jira] Updated: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2458: Attachment: LUCENE-2458.patch updated patch that cuts over the remaining two qps: the flexible que

mingw /implib:foo.lib equivalent ?

2010-05-24 Thread Andi Vajda
Hi Bill, Would you know what the equivalent mingw gcc flag for MSVC's /implib:foo.lib flag is? This overrides the default name and location that the linker uses to produce a DLL's import library. I added some linking tricks on Windows and Linux for supporting the new --import functionality

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870761#action_12870761 ] Shai Erera commented on LUCENE-2455: I will document it in CHANGES under API section.

Re: TestBackwardsCompatibility

2010-05-24 Thread Shai Erera
Oops :z = 3x Shai On Monday, May 24, 2010, Shai Erera wrote: > So do we want to just remove the 1x indexes from :z and 2x from trunk? > Or do we also want to remove the live migration code? How can one > start with that for example? Are there constants to look for for > example? > > Shai > > On

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870743#action_12870743 ] Michael McCandless commented on LUCENE-2455: bq. With that behind us, did some

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870735#action_12870735 ] Shai Erera commented on LUCENE-2455: Ahh, I knew we must be talking past each other :)

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870724#action_12870724 ] Michael McCandless commented on LUCENE-2455: Sorry -- for each major release,

Re: TestBackwardsCompatibility

2010-05-24 Thread Shai Erera
So do we want to just remove the 1x indexes from :z and 2x from trunk? Or do we also want to remove the live migration code? How can one start with that for example? Are there constants to look for for example? Shai On Monday, May 24, 2010, Mark Miller wrote: > On 5/24/10 11:25 AM, Michael McCan

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870688#action_12870688 ] Shai Erera commented on LUCENE-2455: I'm not sure about the live migration, Mike. Firs

[jira] Commented: (LUCENE-1622) Multi-word synonym filter (synonym expansion at indexing time).

2010-05-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870633#action_12870633 ] Robert Muir commented on LUCENE-1622: - {quote} We'd then need an AutomatonWordQuery -

[jira] Commented: (SOLR-1852) enablePositionIncrements="true" can cause searches to fail when they are parsed as phrase queries

2010-05-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870631#action_12870631 ] Robert Muir commented on SOLR-1852: --- Also, Mark mentioned to me he had concerns about 'ind

[jira] Commented: (SOLR-1852) enablePositionIncrements="true" can cause searches to fail when they are parsed as phrase queries

2010-05-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870628#action_12870628 ] Robert Muir commented on SOLR-1852: --- bq. now this has been in trunk longer, do you feel an

[jira] Commented: (LUCENE-1622) Multi-word synonym filter (synonym expansion at indexing time).

2010-05-24 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870629#action_12870629 ] Uwe Schindler commented on LUCENE-1622: --- In my opinion, we should also have a very s

[jira] Commented: (SOLR-1852) enablePositionIncrements="true" can cause searches to fail when they are parsed as phrase queries

2010-05-24 Thread Peter Wolanin (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870624#action_12870624 ] Peter Wolanin commented on SOLR-1852: - now this has been in trunk longer, do you feel an

Re: Welcome Andrzej Bialecki as Lucene/Solr committer

2010-05-24 Thread Simon Willnauer
Welcome Andrzej! simon On Mon, May 24, 2010 at 11:36 AM, Uwe Schindler wrote: > Welcome Andrzej! I am glad to have you finally on the Team :-) > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > >> -Original Message- >>

[jira] Commented: (LUCENE-1622) Multi-word synonym filter (synonym expansion at indexing time).

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870619#action_12870619 ] Michael McCandless commented on LUCENE-1622: bq. For other reasons, including

[jira] Updated: (LUCENE-2286) enable DefaultSimilarity.setDiscountOverlaps by default

2010-05-24 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated LUCENE-2286: --- Fix Version/s: 3.1 according to CHANGES.txt, this fix is in branch_3x as well. > enable Def

[jira] Commented: (LUCENE-2091) Add BM25 Scoring to Lucene

2010-05-24 Thread Yuval Feinstein (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870605#action_12870605 ] Yuval Feinstein commented on LUCENE-2091: - @Vinay - I have this suggestion. I am u

Re: Solr updateRequestHandler and performance vs. atomicity

2010-05-24 Thread Peter Wolanin
We use autocommit with Solr and I've had this worry too - apparently if you get a hard crash Solr will roll back the not-yet-committed docs. I don't think it's happened more than once in a year, but it's still possible. -Peter On Mon, May 24, 2010 at 9:10 AM, wrote: > Hi all, > > It seems to me t

Solr updateRequestHandler and performance vs. atomicity

2010-05-24 Thread karl.wright
Hi all, It seems to me that the "commit" logic in the Solr updateRequestHandler (or wherever the logic is actually located) conflates two different semantics. One semantic is what you need to do to make the index process perform well. The other semantic is guaranteed atomicity of document rec

Re: Sorting on Facet Fields

2010-05-24 Thread MitchK
I talked to some people who need another sorting option as well. At the moment, custom sorting is not possible out of the box. I don't need such a feature right now, but it would be nice to have. If there is more interest on the mailing list, I will register at the JIRA and open an is

[jira] Commented: (LUCENE-1622) Multi-word synonym filter (synonym expansion at indexing time).

2010-05-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870593#action_12870593 ] Robert Muir commented on LUCENE-1622: - bq. There are tricky tradeoffs of index time vs

[jira] Commented: (LUCENE-1622) Multi-word synonym filter (synonym expansion at indexing time).

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870588#action_12870588 ] Michael McCandless commented on LUCENE-1622: Here's the dev thread that led t

Re: TestBackwardsCompatibility

2010-05-24 Thread Mark Miller
On 5/24/10 11:25 AM, Michael McCandless wrote: Yes, I think we can remove support for 1.9 indexes as of 3.0: http://wiki.apache.org/lucene-java/BackwardsCompatibility So starting with 3.0 the oldest indexes we must support are those written by 2.0. Mike On Sun, May 23, 2010 at 12:56 AM, Sh

[jira] Updated: (LUCENE-2471) Supporting bulk copies in Directory

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2471: --- Fix Version/s: 3.1 4.0 > Supporting bulk copies in Directory > --

[jira] Commented: (LUCENE-2471) Supporting bulk copies in Directory

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870560#action_12870560 ] Michael McCandless commented on LUCENE-2471: I think this issue makes sense, s

[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870559#action_12870559 ] Michael McCandless commented on LUCENE-2474: Should we rename this to "CloseEv

[jira] Commented: (LUCENE-2272) PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870551#action_12870551 ] Michael McCandless commented on LUCENE-2272: Thanks Peter -- this looks import

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870549#action_12870549 ] Michael McCandless commented on LUCENE-2455: bq. Backwards support should be m

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-24 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870548#action_12870548 ] Michael McCandless commented on LUCENE-2455: Patch looks great! So awesome se

RE: Welcome Andrzej Bialecki as Lucene/Solr committer

2010-05-24 Thread Uwe Schindler
Welcome Andrzej! I am glad to have you finally on the Team :-) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Monday, May 24, 2010 11:34 AM

Welcome Andrzej Bialecki as Lucene/Solr committer

2010-05-24 Thread Michael McCandless
I'm happy to announce that the PMC has accepted Andrzej Bialecki as Lucene/Solr committer! Welcome aboard Andrzej, Mike

RE: TestBackwardsCompatibility

2010-05-24 Thread Uwe Schindler
But as of 3.0.0 it still supports those indexes :-) So wanna remove in 3.1? - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Monday, May 24, 20

Re: TestBackwardsCompatibility

2010-05-24 Thread Michael McCandless
Yes, I think we can remove support for 1.9 indexes as of 3.0: http://wiki.apache.org/lucene-java/BackwardsCompatibility So starting with 3.0 the oldest indexes we must support are those written by 2.0. Mike On Sun, May 23, 2010 at 12:56 AM, Shai Erera wrote: > Hi > > I'm working on adding su